Question: How To Run A Pipeline On Many Data Sets?
Asked 8.1 years ago by Jean-François Dufayard
Dear Galaxy users,

I would like to do something that is quite simple in theory: I have configured a Galaxy pipeline on a local Galaxy server (installed on a Sun Grid Engine cluster), and I would like to run it on several datasets (several thousand, in one directory) and get the result files in another directory.

With the web interface, whether using libraries or not, I haven't found any solution. Does a simple solution exist? Or has anybody experienced the same problem?

Sincerely yours,
--
Jean-François Dufayard
Research engineer - ARCAD project
CIRAD - Montpellier - France
galaxy • 1.0k views
Modified 8.1 years ago by Jennifer Hillman Jackson • written 8.1 years ago by Jean-François Dufayard
Jennifer Hillman Jackson (United States) wrote, 8.1 years ago:
Hello Jean-Francois,

The Galaxy wiki page describing production setup should help you develop a solution, but please let us know if you need more help.

General: http://bitbucket.org/galaxy/galaxy-central/wiki/Home -> For tool developers and labs
Specific: http://bitbucket.org/galaxy/galaxy-central/wiki/Config/ProductionServer

Best!
Jen
Galaxy team
--
Jennifer Jackson
http://usegalaxy.org
Hi,

I'm also very interested in how to loop over multiple datasets. Although the info below is important for making a Galaxy instance scale to serve many users simultaneously, I don't see how it helps to provide looping support. You still need to manually configure the same tool 1000 times and start 1000 jobs if you want to analyze 1000 files. With the current web interface this isn't much fun... Or am I missing something?

Cheers,
Pi

Biomolecular Mass Spectrometry & Proteomics group
Utrecht University
phone: +31 6 143 66 783
email: pieter.neerincx@gmail.com
skype: pieter.online
visiting address: H.R. Kruyt building // room O607, Padualaan 8 // 3584 CH Utrecht // The Netherlands
mail address: P.O. box 80.082 // 3508 TB Utrecht // The Netherlands
Reply written 8.1 years ago by Pieter Neerincx
Follow-up for the original question that would apply to yours as well.

--
You are correct in that the simplest approach for this would be to specify multiple inputs at runtime. This is a feature that does not currently exist, but I'll be working on it soon. You can follow the ticket here:
http://bitbucket.org/galaxy/galaxy-central/issue/409/static-and-library-inputs-for-workflows

-Dannon
--

Best,
Jen
Galaxy team
--
Jennifer Jackson
http://usegalaxy.org
Reply written 8.1 years ago by Jennifer Hillman Jackson
Jennifer Hillman Jackson (United States) wrote, 8.1 years ago:
Hello again,

A Composite Datatype loaded into a history can be used as an input to a workflow in your instance:
https://bitbucket.org/galaxy/galaxy-central/wiki/CompositeDatatypes

Hopefully this helps!

Thanks,
Jen
--
Jennifer Jackson
http://usegalaxy.org
Hello,

If I understand the concept of composite datatypes correctly, it doesn't seem possible to run X occurrences of a pipeline with one. It seems that I would have to create new bricks and a new pipeline adapted to composite datatypes.

My needs are a lot simpler. I've built a pipeline that takes a fasta file as input and, after several bricks, returns a statistics file and a newick tree. I want to be able to run this pipeline quickly and easily on a large number of fasta files contained in a directory, and to retrieve the statistics and newick files easily.

In my opinion, the most natural way to do that with Galaxy would be to run a pipeline on a complete directory of a Galaxy library. To be honest, I was quite surprised not to find this option.

Did I miss something with composite datatypes? Is there any simple solution for this simple task?

Thanks a lot,

(In reply to Jennifer Jackson <jen@bx.psu.edu>, 2010/11/3)
--
Jean-François Dufayard
Research engineer - ARCAD project
CIRAD - Montpellier - France
Reply written 8.1 years ago by Jean-François Dufayard
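Editorial sketch of the request above: one pipeline run per fasta file in an input directory, with results collected in another directory. This is only an outline of a driver loop that lives outside Galaxy, not a Galaxy feature and not anything the posters describe using. The names `run_on_directory`, `Pipeline`, `FASTA_SUFFIXES` and `fake_pipeline` are all hypothetical, and the step that would actually submit a file to a Galaxy workflow is deliberately left as a pluggable callable, since the thread states that no such multi-input option existed at the time.

```python
from pathlib import Path
from typing import Callable, Dict, List

# Hypothetical contract: a pipeline takes one fasta file and a directory to
# write into, and returns the paths of the result files it produced.
Pipeline = Callable[[Path, Path], List[Path]]

# Assumed set of extensions that count as fasta input.
FASTA_SUFFIXES = {".fa", ".fas", ".fasta"}


def run_on_directory(input_dir: Path, output_dir: Path,
                     pipeline: Pipeline) -> Dict[str, List[Path]]:
    """Run `pipeline` once per fasta file in input_dir.

    Results for each input go into their own sub-directory of output_dir,
    named after the input file, so thousands of runs do not overwrite
    each other.
    """
    results: Dict[str, List[Path]] = {}
    for fasta in sorted(input_dir.iterdir()):
        if not fasta.is_file() or fasta.suffix.lower() not in FASTA_SUFFIXES:
            continue  # skip sub-directories and non-fasta files
        job_dir = output_dir / fasta.name
        job_dir.mkdir(parents=True, exist_ok=True)
        results[fasta.name] = pipeline(fasta, job_dir)
    return results


def fake_pipeline(fasta: Path, job_dir: Path) -> List[Path]:
    """Stand-in for the real workflow (assumption, for illustration only).

    Writes a statistics file with the sequence count and a placeholder
    newick file, mirroring the two outputs mentioned in the post.
    """
    n_seqs = sum(1 for line in fasta.read_text().splitlines()
                 if line.startswith(">"))
    stats = job_dir / "stats.txt"
    tree = job_dir / "tree.nwk"
    stats.write_text(f"sequences\t{n_seqs}\n")
    tree.write_text(";\n")  # empty placeholder tree, not a real result
    return [stats, tree]
```

Calling `run_on_directory(Path("fasta_in"), Path("results"), fake_pipeline)` would process every fasta file found in `fasta_in` (directory names here are placeholders). In a real setup `fake_pipeline` would be replaced by a function that hands the file to the Galaxy workflow and downloads its outputs; how to do that depends on the Galaxy version and is not covered by this thread.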
You are correct in that the simplest approach for this would be to specify multiple inputs at runtime. This is a feature that does not currently exist, but I'll be working on it soon. You can follow the ticket here:
http://bitbucket.org/galaxy/galaxy-central/issue/409/static-and-library-inputs-for-workflows

-Dannon
Reply written 8.1 years ago by Dannon Baker