Question: Parallelizing an NGS Mapping Workflow
8.5 years ago, Chris Berthiaume wrote:
Hello,

I'd like to use Galaxy on our local Beowulf cluster for NGS workflows. One typical use case we'd be replacing with Galaxy is a parallel BWA alignment of large fastq files. To distribute this across the cluster we split the fastq file into many parts, run each separately against the same reference, and then use samtools to merge the SAM output. It's not uncommon to end up with hundreds of parts after splitting.

How does Galaxy handle the parallelization of large NGS mappings? I've found the tools for fastq QC, mapping, and SAM merging, but couldn't find any set of tools that would control the parallelization. This trouble ticket ( central/issue/197/starting-workflows-with-a-pool-of-input) would suggest this functionality hasn't been implemented yet, but it seems necessary for many (most?) Illumina or SOLiD runs to get a reasonable mapping turnaround time.

If this is already a feature, it would be great if I could be pointed to the relevant docs, and maybe it could be given a more prominent place in the wiki/interface. If it's not yet a feature, is there a timeline for when it will be added?

Thanks,
Chris

--
Chris Berthiaume
Center for Environmental Genomics
University of Washington
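For readers unfamiliar with the split/align/merge pattern described in the question, here is a minimal Python sketch of it. It is not how Galaxy does it and not necessarily the poster's actual scripts: the function names, the part-file naming, and the choice of `bwa mem` piped into `samtools sort` are all assumptions made for illustration, and the exact command-line options depend on the installed BWA and samtools versions. The sketch only builds the command strings; submitting them to a cluster scheduler is left out.

```python
import itertools
import shlex
from pathlib import Path


def split_fastq(fastq_path, out_dir, reads_per_part):
    """Split a FASTQ file into parts of at most `reads_per_part` records."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    parts = []
    with open(fastq_path) as handle:
        while True:
            # A FASTQ record is assumed to be exactly four lines:
            # header, sequence, '+' separator, quality string.
            chunk = list(itertools.islice(handle, 4 * reads_per_part))
            if not chunk:
                break
            part = out_dir / f"part_{len(parts):04d}.fastq"  # hypothetical naming
            part.write_text("".join(chunk))
            parts.append(part)
    return parts


def alignment_commands(parts, reference):
    """Build one independent shell command per part (assumed tool invocations)."""
    ref = shlex.quote(str(reference))
    commands = []
    for part in parts:
        bam = shlex.quote(str(part.with_suffix(".bam")))
        fq = shlex.quote(str(part))
        # Each part is aligned against the same reference and sorted,
        # so the per-part outputs can be merged afterwards.
        commands.append(f"bwa mem {ref} {fq} | samtools sort -o {bam} -")
    return commands


def merge_command(parts, merged_bam):
    """Build the final command that merges all per-part alignments."""
    bams = " ".join(shlex.quote(str(p.with_suffix(".bam"))) for p in parts)
    return f"samtools merge {shlex.quote(str(merged_bam))} {bams}"
```

Because each alignment command touches only its own part, the commands can run concurrently on different nodes; the merge command is the only step that has to wait for all of them to finish.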
Tags: bwa, alignment, samtools, bam
8.4 years ago, Nate Coraor (United States) wrote:
Hi Chris,

This is a long-standing feature request which has a ticket here:

Unfortunately, there is still no timeline for when it will be implemented, but it is moving up the list of priorities.

--nate