Is it possible to use loops (or cycles) in a Galaxy workflow? I have a file with multiple FASTA sequences and I need to process them one by one with a Galaxy tool (e.g. the EMBOSS cusp tool). The tool normally only allows processing all sequences together (concatenated).
Hi,
Loops are currently not possible, but the Galaxy Team is actively working on that. I think dataset collections, currently scheduled for the next Galaxy release in one month, will address your use case.
In the meantime, try inserting the following snippet into your EMBOSS tool wrapper.
<parallelism method="multi" split_inputs="infile" split_mode="to_size" split_size="1" merge_outputs="TODO"></parallelism>
Adapt split_inputs and merge_outputs to the variable names used in your EMBOSS wrapper. The snippet will automatically split your multi-FASTA file into single-sequence FASTA files and concatenate the output of all runs into one single output. The only requirement is that the outputs can be concatenated.
Dataset collections will not cover this use case, since it involves only a single file. Also, the parallelism mechanism is really designed for splitting as an optional performance optimization, not splitting as a necessity. For this reason it is off by default, so tools that depend on it would not work by default. My recommendation would be to write a simple Python/Perl/sh wrapper script that consumes the file and repeatedly calls the underlying tool.
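A minimal sketch of such a wrapper in Python, in case it helps. The function names (iter_fasta_records, run_per_record, cusp_command, main) are made up for illustration, and the cusp options shown (-sequence, -outfile) are an assumption that you should check against your EMBOSS installation. It also assumes, as with the parallelism approach, that the per-sequence outputs can simply be concatenated.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def iter_fasta_records(handle):
    """Yield each FASTA record (header line plus sequence lines) as one string."""
    record = []
    for line in handle:
        # A new header closes the record collected so far.
        if line.startswith(">") and record:
            yield "".join(record)
            record = []
        if line.strip():  # skip blank lines
            record.append(line if line.endswith("\n") else line + "\n")
    if record:
        yield "".join(record)


def run_per_record(fasta_path, build_command, merged_output):
    """Run the tool once per record and concatenate the per-record outputs."""
    with tempfile.TemporaryDirectory() as tmp, \
            open(fasta_path) as fasta, open(merged_output, "w") as merged:
        for index, record in enumerate(iter_fasta_records(fasta)):
            single_in = Path(tmp) / f"record_{index}.fasta"
            single_out = Path(tmp) / f"record_{index}.out"
            single_in.write_text(record)
            # check=True raises an error on the first failing call.
            subprocess.run(build_command(str(single_in), str(single_out)), check=True)
            merged.write(single_out.read_text())


def cusp_command(single_in, single_out):
    # Assumed EMBOSS cusp invocation; verify the options for your version.
    return ["cusp", "-sequence", single_in, "-outfile", single_out]


def main(argv):
    # argv[0]: multi-FASTA input, argv[1]: merged output file
    run_per_record(argv[0], cusp_command, argv[1])
```

To use it as a script, call main(sys.argv[1:]) at the bottom of the file and point the command line in the Galaxy tool XML at the script instead of at cusp directly. Passing the command builder as an argument keeps the splitting logic reusable for other tools that only accept one sequence at a time.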