Hi everyone, as I have built a few 16S workflows, I was asked to test Galaxy and assess its strengths and weaknesses. I ran my pipeline on my 36 samples and it failed because there was too much data, which I can understand: my samples contain more than 12 million sequences. So I reduced the dataset to 13 samples, about 2 million sequences, but at one specific step (pre.cluster) it doesn't report an error, it just keeps running, and it has been going for more than 15 hours now! Is this still too much? How can I tell that something is actually happening and that I'm not wasting my time? Thank you very much for your answers!
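(For reference, a quick way to check how many sequences each sample holds before uploading, assuming plain uncompressed FASTQ where every record is exactly four lines; the file names in the usage comment are hypothetical:)

```python
def count_fastq_reads(path):
    """Count records in an uncompressed FASTQ file (4 lines per read)."""
    with open(path) as fh:
        n_lines = sum(1 for _ in fh)
    if n_lines % 4 != 0:
        raise ValueError(f"{path}: line count {n_lines} is not a multiple of 4")
    return n_lines // 4

# Hypothetical usage: total reads across all samples
# total = sum(count_fastq_reads(p) for p in ["sample01.fastq", "sample02.fastq"])
```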
Question: Galaxy running for too long
alpac • 0 wrote:
Martin Čech ♦♦ 4.9k wrote:
I assume you are working on usegalaxy.org.
I do not know which tool you are running that takes this much time, but some tools can legitimately take this long with large inputs. Generally, if a tool runs into trouble it will stop and show an error in the Galaxy interface.
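(Besides watching the history panel, the "is anything happening?" question can be sketched as a simple polling loop. In real use the `get_state` callable would query the Galaxy API, for example via the BioBlend library's jobs client; here it is an injected function so the sketch stays self-contained, and the set of terminal states is an assumption:)

```python
import time

# States assumed to mean the job is finished, one way or another.
TERMINAL_STATES = {"ok", "error", "deleted"}

def wait_for_job(get_state, poll_seconds=60, max_polls=None):
    """Poll get_state() until the job reaches a terminal state.

    get_state: zero-argument callable returning the current job state string.
    Returns the terminal state so the caller can tell success from failure.
    """
    polls = 0
    while True:
        state = get_state()
        if state in TERMINAL_STATES:
            return state
        polls += 1
        if max_polls is not None and polls >= max_polls:
            raise TimeoutError(f"job still '{state}' after {polls} polls")
        time.sleep(poll_seconds)
```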
Hi, thank you for your answer! Yes, indeed, I'm working on usegalaxy.org. It finished an hour ago! But I decided to reduce my samples again. What I want to know is: how much data is tolerated by Galaxy? Thank you very much!
Your limit on usegalaxy.org will most probably be your disk quota (250 GB by default). Galaxy can work with much bigger files, but then you have to provide the infrastructure yourself; see https://galaxyproject.org/choices/
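(As a rough pre-flight check, you can total your input file sizes locally and compare them against the quota before uploading. A minimal sketch, where the 250 GB figure is the default mentioned above and `quota_report` is a hypothetical helper name:)

```python
import os

DEFAULT_QUOTA_BYTES = 250 * 1024**3  # 250 GB default quota on usegalaxy.org

def quota_report(paths, quota_bytes=DEFAULT_QUOTA_BYTES):
    """Sum the sizes of local input files and report quota headroom."""
    total = sum(os.path.getsize(p) for p in paths)
    return {
        "total_bytes": total,
        "fits": total <= quota_bytes,
        "fraction_used": total / quota_bytes,
    }
```

Note that uploads are only part of quota use: intermediate and output datasets count against it too, so leave generous headroom.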
Jobs in Galaxy use roughly the same resources they would use when running the exact same job on the command line. Memory use and runtime depend on factors such as the tool itself (some use more resources than others, wherever they are executed), the parameters selected, and the data content (query input and targets, if included).
When using a public Galaxy server, the resources allocated to tools can vary by server. To find out whether your data will run successfully on any of them, you'll need to execute test jobs or workflows and review the outcome: success; failure for server-resource reasons; or failure for reasons that would also make the job fail on the command line under the same conditions. It is possible to construct jobs that will run out of resources no matter where they run, Galaxy or command line, even given very generous allocations. Review the underlying wrapped third-party tool to better understand how it works, its limitations, and best-practice usage.
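(One practical way to build those test jobs is to subsample each input first, so the same workflow can be tried on a small slice before committing the full data. A minimal sketch for uncompressed FASTQ that simply takes the first n reads; a random subsample, for example via a dedicated tool such as seqtk, would be more representative:)

```python
from itertools import islice

def head_fastq(src_path, dst_path, n_reads):
    """Copy the first n_reads records (4 lines each) of a FASTQ file."""
    with open(src_path) as src, open(dst_path, "w") as dst:
        for line in islice(src, n_reads * 4):
            dst.write(line)
```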
If your jobs are too large to run on one particular public Galaxy server, you can try others. And if no public server has sufficient resources to process your data, you can set up your own Galaxy instance and allocate the resources your chosen data and tools need.
Thanks! Jen, Galaxy team