Question: publicly available traces/logs
Are there publicly available traces/logs from Galaxy instances?

For my research, I would like to investigate the impact of different storage strategies on a framework like Galaxy. For that, it would be nice to have some real usage data. It is fine if users are anonymized, and the details of the tools are not important either; I'm only concerned with dataset accesses.

Alternatively, if I can't find traces, I would have to create synthetic ones. In that case it would be good to have some usage statistics: dataset size, lifetime, how long after upload a dataset is accessed again, etc.

I searched for this but did not find anything I could use. Please let me know if you have any suggestions :)

Thanks! Best regards, Francieli

Application logs are usually private by design, since they contain non-public information such as IP addresses. I doubt anybody would share them with you as they are. Parsing and anonymizing them is also a lot of work and error-prone. If you specify precisely, item by item, what data you need, we can go through the points and see what can be done.

Hello Martin, thanks for your reply, and sorry for my delay in responding. I understand how problematic this is; I asked because I figured someone might already have gone through that trouble for some reason. Since my research is about data management, what I would need is simply information on datasets: size, type (optional), creation time, and the times when each dataset was used as input for tools. Thanks! Best regards, Francieli
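
To make the request above concrete, here is a minimal sketch of what one anonymized per-dataset trace record could look like. It is only an illustration: the record layout, the field names (`dataset_id`, `size_bytes`, `datatype`, `created_at`, `input_times`) and the helper functions are hypothetical and are not taken from Galaxy's database schema or API. The salted hash is a simple pseudonymization, assumed here to be enough to hide the real dataset identifiers.

```python
import csv
import hashlib
import io
from dataclasses import dataclass, field
from datetime import datetime, timedelta, timezone
from typing import List


@dataclass
class DatasetTrace:
    """One record per dataset (hypothetical layout, not Galaxy's schema)."""
    dataset_id: str                  # pseudonymized identifier
    size_bytes: int                  # dataset size
    created_at: datetime             # creation/upload time
    datatype: str = ""               # optional type, e.g. "fastq"
    input_times: List[datetime] = field(default_factory=list)  # times used as tool input


def pseudonymize(raw_id: str, salt: str) -> str:
    """Replace a real identifier with a salted, truncated SHA-256 digest."""
    return hashlib.sha256((salt + raw_id).encode("utf-8")).hexdigest()[:16]


def reuse_delays_seconds(trace: DatasetTrace) -> List[float]:
    """Seconds between creation and each later use as tool input, in time order."""
    return [(ts - trace.created_at).total_seconds() for ts in sorted(trace.input_times)]


def to_csv(traces: List[DatasetTrace]) -> str:
    """Serialize traces as CSV; input times are joined with ';' in one column."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["dataset_id", "size_bytes", "datatype", "created_at", "input_times"])
    for t in traces:
        writer.writerow([
            t.dataset_id,
            t.size_bytes,
            t.datatype,
            t.created_at.isoformat(),
            ";".join(ts.isoformat() for ts in t.input_times),
        ])
    return buf.getvalue()


# Made-up example values, for illustration only.
created = datetime(2020, 1, 1, tzinfo=timezone.utc)
example = DatasetTrace(
    dataset_id=pseudonymize("42", salt="my-secret-salt"),
    size_bytes=1024,
    created_at=created,
    datatype="fastq",
    input_times=[created + timedelta(days=1), created + timedelta(hours=1)],
)
```

A file of such records would cover size, type, creation time and input accesses, and the delays returned by `reuse_delays_seconds` are the kind of statistic that could also feed a synthetic trace generator.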

