Question: Galaxy running for too long
written 3 months ago by alpac0:

Hi everyone. Having built a few 16S workflows, I was asked to test Galaxy and assess its strengths and weaknesses. I ran my pipeline on my 36 samples and it failed because there was too much data, which I can understand: my samples contain more than 12 million sequences. So I reduced the run to 13 samples, about 2 million sequences, but at a specific step (pre.cluster) I get no error; the job just keeps running, now for more than 15 hours! I'm wondering, is this still too much data? How can I tell that something is actually happening and that I'm not wasting my time here? Thank you very much for your answers!

modified 3 months ago • written 3 months ago by alpac0

Hi, thank you for your answer! Yes, indeed, I'm working on it. It finished an hour ago! But I decided to reduce my sample count again. What I want to know is: how much data is "tolerated" by Galaxy? Thank you very much!

written 3 months ago by alpac0

Your limit will most probably be your disk-space quota (250 GB by default). Galaxy can work with much bigger files, but you have to provide the infrastructure. see

written 12 weeks ago by Martin Čech ♦♦ 4.9k

Jobs in Galaxy will use resources similar to what they would use when running the exact same job on the command line. Memory usage and runtime depend on factors such as the tool itself (some use more resources than others, wherever they are executed), the parameters selected, and the data content (query input and targets, if included).
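
Since a Galaxy job mirrors the same invocation run outside Galaxy, one way to estimate a job's footprint is to time it on the command line first. A minimal Python sketch (the command shown is a placeholder, not from this thread; you would swap in the actual tool invocation, e.g. a mothur batch command; note that `ru_maxrss` is kilobytes on Linux but bytes on macOS):

```python
import resource
import subprocess
import sys
import time

def run_and_measure(cmd):
    """Run a command and report wall time plus the peak memory of
    child processes (ru_maxrss: KB on Linux, bytes on macOS)."""
    start = time.monotonic()
    subprocess.run(cmd, check=True)
    wall_seconds = time.monotonic() - start
    peak = resource.getrusage(resource.RUSAGE_CHILDREN).ru_maxrss
    return wall_seconds, peak

# Placeholder command; replace with the same invocation the Galaxy
# tool wrapper would run.
wall, peak = run_and_measure([sys.executable, "-c", "print('ok')"])
```

Comparing these numbers against a server's advertised per-job limits gives a rough idea of whether a public instance can handle the data.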

When using a public Galaxy server, the resources allocated to tools can vary by server. To find out whether your data will run successfully at any of them, you'll need to execute test jobs/workflows and review the outcome: success; failure for server-resource reasons; or failure for reasons that would also make the job fail on the command line (under the same conditions). It is possible to construct jobs that will run out of resources no matter where they are run, in Galaxy or on the command line, even when given very large allocations. Review the underlying wrapped third-party tool to better understand how it works, its limitations, and best-practice usage.
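
One cheap pre-check before submitting test jobs is to confirm the scale of the input, since the thread hinges on sequence counts (12 million vs. 2 million). A minimal sketch, assuming plain (uncompressed) FASTA input; the function name and file path are illustrative, not from the thread:

```python
def count_fasta_sequences(path):
    """Return the number of sequences in a FASTA file by counting
    '>' header lines."""
    with open(path) as handle:
        return sum(1 for line in handle if line.startswith(">"))

# Example (placeholder path):
# n = count_fasta_sequences("samples.fasta")
```

Knowing the count up front makes it easier to decide how far to subsample before a test run.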

If jobs are too large to run at any particular public Galaxy server, you can try others. And if you cannot find a public server with sufficient resources to process your data, you can set up your own Galaxy and allocate the resources needed by your chosen data and tools.


Thanks! Jen, Galaxy team

written 12 weeks ago by Jennifer Hillman Jackson 25k
written 3 months ago by Martin Čech ♦♦ 4.9k (United States):

I assume you are working on

I do not know which tool you are executing that takes this much time, but it is possible for some tools to take this long with large inputs. Generally, if a tool runs into trouble it will stop and show an error that you can see in the Galaxy interface.

written 3 months ago by Martin Čech ♦♦ 4.9k


Powered by Biostar version 16.09