Question: Large Sample Workflow Question
4.3 years ago, dm21953 (United States) wrote:

We are using a local instance of Galaxy and have about 70 samples that we need to run through a workflow. As of now, the only way we know to do this is to lay out a separate line for each distinct sample, which would amount to about 70 different workflows in one. The samples would run on parallel tracks through a few different jobs and eventually merge into CuffMerge. Is there a way to loop through all of these samples instead of manually indicating each one in what would be a very chaotic and time-consuming workflow? And how would we get around the fact that CuffDiff and CuffMerge would each take about 80 inputs, which would make the workflow unwieldy?

tophat cufflinks galaxy • 903 views
modified 4.3 years ago by jmchilton • written 4.3 years ago by dm21953
4.3 years ago, jmchilton (United States) wrote:

This sounds like exactly the kind of workflow that dataset collections (http://bit.ly/gcc2014workflows), included with the latest couple of releases of Galaxy, were designed to address. These features still need a lot of polish and probably bug fixes - but in theory you should be able to create a workflow that operates over any number of samples in the first several steps and merges them down in the CuffDiff/CuffMerge steps. This would require some modifications to the CuffDiff and CuffMerge tools - hopefully many of the other tools in the workflow would not need to be modified. I could send you the modifications for CuffMerge. CuffDiff is a bit trickier - is this a couple of conditions with many replicates - or many replicates - or what? If you have a fully expanded version of the workflow with the 80 inputs, or a smaller representative workflow with a few, I could potentially extrapolate what a modified CuffDiff tool would need to look like (and let you know if I see any other potential pitfalls).
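
To make the shape of such a workflow concrete, here is a toy Python sketch of the "map over a collection, then merge" pattern. It does not call Galaxy at all: the functions `tophat`, `cufflinks`, `cuffmerge` and `map_over` are hypothetical stand-ins that only build labels, and the sample names (two conditions with three replicates each) are made up for illustration.

```python
from collections import defaultdict

# Hypothetical stand-ins for the real tools; each one only labels its output.
def tophat(sample):
    return f"{sample}.bam"

def cufflinks(bam):
    return bam.replace(".bam", ".gtf")

def cuffmerge(gtfs):
    # A single step that receives the whole collection as one input.
    return "merged(" + ",".join(sorted(gtfs)) + ")"

def map_over(tool, collection):
    # Run the tool once per element, keeping each element's identifier.
    return {name: tool(item) for name, item in collection.items()}

# Made-up sample names: 2 conditions x 3 replicates.
samples = {
    f"{cond}_rep{i}": f"{cond}_rep{i}"
    for cond in ("control", "treated")
    for i in (1, 2, 3)
}

bams = map_over(tophat, samples)    # per-sample step
gtfs = map_over(cufflinks, bams)    # per-sample step
merged = cuffmerge(gtfs.values())   # one merge step over all samples

# A CuffDiff-like step needs the replicates grouped by condition,
# which is why the experimental design matters for that tool.
groups = defaultdict(list)
for name, bam in bams.items():
    condition = name.rsplit("_rep", 1)[0]
    groups[condition].append(bam)
```

The per-sample steps are written once and applied to every element, so the same sketch would cover 70 samples as easily as 6; only the merge and the condition grouping see more than one sample at a time.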

Eventually these tools need to be updated in the tool shed as well. If we get them working well for you, I can try to get whatever changes we make into the tool shed.

Alternatively - what many people have traditionally done for this kind of work in Galaxy is to stop the workflow before the CuffDiff/CuffMerge step and run 80 copies of the simpler workflow (using the batch workflow submission button - let me know if you need more information on that), then do the merge and diff steps as simple tool executions, and then run another workflow for the post-merge analysis.
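
If you would rather script the batch submission than click through it, the loop might look roughly like the sketch below. This is an assumption on my part rather than what the batch button does internally: it expects `gi` to be a BioBlend `GalaxyInstance` and uses its `workflows.invoke_workflow` method from current BioBlend releases. The function name `submit_per_sample`, the step label `"0"`, and all IDs are placeholders.

```python
def submit_per_sample(gi, workflow_id, history_id, dataset_ids, input_step="0"):
    """Invoke the same per-sample workflow once per input dataset.

    `gi` is assumed to be a bioblend.galaxy.GalaxyInstance; `input_step`
    is the (assumed) label of the workflow's single data input step.
    Returns the list of invocation ids, one per sample.
    """
    invocation_ids = []
    for dataset_id in dataset_ids:
        # Map the workflow input step to one history dataset ("hda").
        inputs = {input_step: {"id": dataset_id, "src": "hda"}}
        invocation = gi.workflows.invoke_workflow(
            workflow_id, inputs=inputs, history_id=history_id
        )
        invocation_ids.append(invocation["id"])
    return invocation_ids

# Example wiring (placeholders; requires the bioblend package and an API key):
# from bioblend.galaxy import GalaxyInstance
# gi = GalaxyInstance(url="https://galaxy.example.org", key="YOUR_API_KEY")
# submit_per_sample(gi, "WORKFLOW_ID", "HISTORY_ID", ["DATASET_ID_1", "DATASET_ID_2"])
```

The merge and diff steps would still be run once afterwards, on the outputs of all invocations.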

 


Made some progress on this here if anyone wants to test - https://github.com/galaxyproject/tools-devteam/pull/20.

written 4.0 years ago by jmchilton