Question: Parallelizing an NGS Mapping Workflow
Chris Berthiaume • 10 wrote:
Hello,
I'd like to use Galaxy on our local Beowulf cluster for NGS workflows.
One typical use case we'd be replacing with Galaxy is a parallel BWA
alignment of large fastq files. To distribute this across the cluster
we split the fastq file into many parts, run each separately against
the same reference, and then use samtools to merge the SAM output.
It's not uncommon to end up with hundreds of parts after splitting.
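
To make the split step concrete, here is a minimal sketch of how the fastq file might be cut into parts before mapping. It is only an illustration, not our actual script: the function name `split_fastq`, the `part_NNNN.fastq` naming, and the assumption that every record is exactly four lines (no wrapped sequences) are all hypothetical choices made for the example.

```python
import itertools
import os


def split_fastq(fastq_path, reads_per_part, out_dir):
    """Split a FASTQ file into parts of at most reads_per_part records.

    Assumes the common 4-lines-per-record layout (no wrapped sequences).
    Returns the list of part file paths, in order.
    """
    if reads_per_part < 1:
        raise ValueError("reads_per_part must be at least 1")
    os.makedirs(out_dir, exist_ok=True)
    lines_per_part = 4 * reads_per_part
    part_paths = []
    with open(fastq_path) as src:
        while True:
            # Pull the next block of records straight off the file iterator.
            chunk = list(itertools.islice(src, lines_per_part))
            if not chunk:
                break
            # Hypothetical naming scheme: part_0000.fastq, part_0001.fastq, ...
            part_path = os.path.join(
                out_dir, "part_%04d.fastq" % len(part_paths)
            )
            with open(part_path, "w") as dst:
                dst.writelines(chunk)
            part_paths.append(part_path)
    return part_paths
```

Each returned part would then be submitted as its own BWA job against the shared reference, and the per-part outputs merged with samtools once all jobs finish.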
How does Galaxy handle the parallelization of large NGS mappings?
I've found the tools for fastq QC, mapping, and SAM merging, but
couldn't find any set of tools that would control the parallelization.
This trouble ticket
(http://bitbucket.org/galaxy/galaxy-central/issue/197/starting-workflows-with-a-pool-of-input)
would suggest this functionality hasn't been implemented yet, but it seems
necessary for many (most?) Illumina or SOLiD runs to get a reasonable
mapping turnaround time. If this is already a feature it would be
great if I could be pointed to the relevant docs and maybe it could be
given a more prominent place in the wiki/interface. If it's not yet a
feature, is there a timeline for when it will be added?
Thanks,
Chris
--
Chris Berthiaume
Center for Environmental Genomics
University of Washington
modified 8.4 years ago by Nate Coraor ♦ 3.2k • written 8.5 years ago by Chris Berthiaume • 10