Question: Batch Jobs...
6.1 years ago by
Neil.Burdett@csiro.au wrote:
Hi, I've created a workflow that works well. The workflow accepts 2 input files. I have uploaded all my input files (there are a lot of them). How can I batch the process? I don't want to select the files and run each job manually. Is there a way to batch this action? Thanks, Neil
6.1 years ago by
Björn Grüning wrote:
Hi Neil, when you run your workflow, there is a small symbol/icon (it looks like papers). If you click it, you can select multiple input files at once. You can probably also use the Galaxy API for your task [1]. Happy research! Bjoern
[1] http://wiki.g2.bx.psu.edu/Learn/API
-- Björn Grüning, Albert-Ludwigs-Universität Freiburg, Institute of Pharmaceutical Sciences, Pharmaceutical Bioinformatics, Hermann-Herder-Strasse 9, D-79104 Freiburg i. Br. Tel.: +49 761 203-4872 Fax.: +49 761 203-97769 E-Mail: bjoern.gruening@pharmazie.uni-freiburg.de Web: http://www.pharmaceutical-bioinformatics.org/
In reply to Björn Grüning's message of Fri, Oct 26, 2012 at 7:46 AM: Well, this would only work if one of the 2 input files Neil mentions is fixed. Multiple input file selection doesn't work for workflows with more than one input file, unless only one input changes in each run and the rest are fixed. At least as far as I know. Using the API might be your only option if this is the case. --Carlos
written 6.1 years ago by Carlos Borroto
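The API route Carlos mentions, for the case where both inputs change on every run, could be prepared roughly as below. This is a minimal sketch under stated assumptions, not Galaxy's own code: it only builds one inputs mapping per pair of datasets, in the `{step: {"id": ..., "src": "hda"}}` shape that, as far as I know, the BioBlend client accepts when invoking a workflow. The function name, the step labels "0" and "1", and the dataset IDs are hypothetical, and actually submitting each mapping to a Galaxy server is left out.

```python
from typing import Dict, List


def build_paired_inputs(
    first_ids: List[str],
    second_ids: List[str],
    first_step: str = "0",   # hypothetical label of the first Input Dataset step
    second_step: str = "1",  # hypothetical label of the second Input Dataset step
) -> List[Dict[str, Dict[str, str]]]:
    """Return one workflow inputs mapping per (first, second) dataset pair."""
    if len(first_ids) != len(second_ids):
        raise ValueError("both inputs need the same number of datasets")
    runs = []
    for first_id, second_id in zip(first_ids, second_ids):
        # "hda" marks a dataset that lives in a history (assumed source type).
        runs.append({
            first_step: {"id": first_id, "src": "hda"},
            second_step: {"id": second_id, "src": "hda"},
        })
    return runs


if __name__ == "__main__":
    # Placeholder IDs; real ones would come from the Galaxy server.
    for inputs in build_paired_inputs(["img1", "img2"], ["atlasA", "atlasB"]):
        print(inputs)
```

Each mapping would then be passed to one workflow invocation, so the pairing logic stays separate from the code that talks to the server.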
Thanks Bjorn, but I can't seem to locate the symbol/icon (the one that looks like papers). Is there any documentation showing where it is located on the screen? Thanks, Neil
written 6.1 years ago by Neil.Burdett@csiro.au
They were introduced and described in this news brief (with screenshots!): http://wiki.g2.bx.psu.edu/DevNewsBriefs/2011_05_20 Hope this helps, Dannon
written 6.1 years ago by Dannon Baker
6.1 years ago by
Dannon Baker (United States) wrote:
Ahh, I see what's going on. Galaxy relies on the "Input Dataset" step for this feature. If you use these in your workflow, Galaxy will be able to perform the batch execution. Find them in the workflow editor under "Workflow control" -> inputs. -Dannon
Thanks Dannon, I can see them now, but how do I use them? I now have 3 steps: Input File (as before), Input Dataset (new), and Atlas list (as before). I select the "Input file" in step 1 and the atlases in step 3 (as before), and I can select multiple files for the Input Dataset. But how does that tie into steps 1 and 3, as it's not connected on the workflow diagram? Neil
written 6.1 years ago by Neil.Burdett@csiro.au
Connect the Input Dataset workflow step to the dataset input of the step you'd like to run multiple inputs across, like below. With that example workflow, I can select a batch of inputs that will *all* be mapped with BWA. [attached image: image001.png]
written 6.1 years ago by Dannon Baker
Okay, cool, thanks. Neil
written 6.1 years ago by Neil.Burdett@csiro.au
Hi, I have a local Galaxy installation. I've created a data library, selected "Upload files from filesystem paths", pasted a path in the "path to upload" window, and selected the option to preserve the directory structure. The files get imported. How do I now access these files from my application? I don't want to import them into the history, as they then lose the directory structure. I can't see where they are physically located under the galaxy-dist structure. Thanks for any help, Neil
written 5.9 years ago by Neil.Burdett@csiro.au
Try importing those library files to the history where you want them - browse the Galaxy 'shared data' tab to where you uploaded them.
written 5.9 years ago by fubar
Hi Ross, I think I need to clarify. I have a file at /home/galaxy/data-test/dir1/dir2/somefile.txt. Under "Upload files from filesystem paths", in the "path to upload" window, I paste "/home/galaxy/data-test". This puts "somefile.txt" in the /home/galaxy/galaxy-dist/database/files/000 directory. However, I elected to keep the directory structure. I can see the structure if I navigate through the "shared data" tab, but where is this information stored under the galaxy-dist structure? My application needs the directory structure to be kept, so I need to access it from the xml/command line. I thought it might have been something like /home/galaxy/galaxy-dist/database/files/000/data-test/dir1/dir2/dataset_id.dat, but this is not the case; rather, it is /home/galaxy/galaxy-dist/database/files/000/dataset_id.dat, i.e. no directory structure. So how can I access this information from the xml files in the tools directory? Thanks, Neil
written 5.9 years ago by Neil.Burdett@csiro.au
Neil, It would help if you could point to an existing tool that works the way you want. I don't know of any that deal with arbitrary nested directories containing arbitrary files. A new composite datatype could impose a structure that a tool could be written to deal with (eg the pbed datatype used in some rgenetics tools) but arbitrary data structures are not going to be possible AFAIK. You're unlikely to get useful help without a much more complete and clear explanation of the problem.
written 5.9 years ago by fubar
Hi Ross, I don't know of any tools that work in the way I want, but I'm not an expert on tools within Galaxy. Essentially, the data in the directories will be fixed. We run a tool from Galaxy that generates some output data; this data is then "checked" against the data located under the directories I am trying to upload to Galaxy. There will probably be around 20 directories, and the tool would search these directories with the data it produced, looking for "a closest match". Once that is located, it would use the remaining files in that directory to complete the process. So, for example, the application is segmenting an image, so a part of the image is the output. This is compared with files in the uploaded directories, a file in a particular directory is chosen (as the closest match), and the remaining files in that directory are then used to complete the process. Does that make sense? There would be around 20 files in each directory. Thanks, Neil
written 5.9 years ago by Neil.Burdett@csiro.au
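The selection step Neil describes (find the closest match across a set of directories, then use the rest of that directory) might look roughly like this outside Galaxy. It is only a sketch under assumptions: the function name is hypothetical, and `score` is a placeholder for whatever image comparison the real application performs, which the thread does not specify. Lower scores are treated as closer matches.

```python
from pathlib import Path
from typing import Callable, List, Optional, Tuple


def pick_closest_directory(
    root: Path, score: Callable[[Path], float]
) -> Tuple[Path, List[Path]]:
    """Return the best-scoring file under root and the other files beside it."""
    best_file: Optional[Path] = None
    best_score: Optional[float] = None
    # Visit every file in every immediate subdirectory of root.
    for directory in sorted(p for p in root.iterdir() if p.is_dir()):
        for candidate in sorted(p for p in directory.iterdir() if p.is_file()):
            value = score(candidate)
            if best_score is None or value < best_score:
                best_file, best_score = candidate, value
    if best_file is None:
        raise ValueError("no candidate files found under %s" % root)
    # The remaining files in the winning directory complete the process.
    remaining = [
        p for p in sorted(best_file.parent.iterdir())
        if p.is_file() and p != best_file
    ]
    return best_file, remaining
```

A tool written this way needs the directories to exist on disk with their original layout, which is exactly what the flat dataset storage described above does not provide.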
Hi Neil, thanks - that sounds interesting. Like I said, composite datatypes are designed to manage collections of related files as a unit, and this sounds like a potential use case. There are lots of tools and lots of code that can serve as examples, but it's definitely not trivial, because you will almost certainly be subclassing the Html data class and writing methods to manage those related files (i.e. extending the guts of Galaxy), and your tools will all need to know how to deal with the managed structure when they get one as an input. You may need to find or build up a programmer with some relevant Galaxy composite datatype experience. There is some documentation, but it's not extensive or transparent. Good luck.
written 5.9 years ago by fubar
Thanks for the help, Ross. Any chance you can point me to the examples you mentioned? Thanks again, Neil
written 5.9 years ago by Neil.Burdett@csiro.au
For the simplest case, start with the tools/rgenetics/rgFastQC tool - it doesn't need a subclass but uses the Html datatype files_path as a simple multiple file bucket. Once you've got that all figured out, check out the rgenetics datatypes (eg pbed) subclassed from Html defined in lib/galaxy/datatypes/genetics and the tools that use it (eg TDT or CaCo tools) in tools/rgenetics for more complex hackery keeping related files needed by plink together.
written 5.9 years ago by fubar
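The "simple multiple file bucket" idea can be sketched as a small tool-side script. This is not the rgFastQC code, just an illustration under assumptions: the tool is handed the path of its Html output and the path of that output's files_path directory, copies its related files into the directory, and writes an index page linking to them. The function and argument names are hypothetical, and the relative links assume the index is served alongside the files directory.

```python
import html
import shutil
from pathlib import Path
from typing import Iterable


def write_file_bucket(index_html: Path, files_dir: Path, sources: Iterable[Path]) -> None:
    """Copy related files into files_dir and write an HTML index that lists them."""
    files_dir.mkdir(parents=True, exist_ok=True)
    links = []
    for src in sources:
        # Keep each file under its own name inside the bucket directory.
        shutil.copy(src, files_dir / src.name)
        name = html.escape(src.name)
        links.append('<li><a href="%s">%s</a></li>' % (name, name))
    index_html.write_text(
        "<html><body><ul>\n" + "\n".join(links) + "\n</ul></body></html>\n"
    )
```

A downstream tool that receives the Html dataset can then look inside the same directory for the related files, which is the part a composite datatype subclass would formalize.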
6.1 years ago by
Neil.Burdett@csiro.au wrote:
Do I need to modify a setting in universe_wsgi.ini or somewhere, then? I attach screenshots of what I've got (I have a recent subversion checkout)... Thanks, Neil [attached images: image001.jpg, image002.jpg]