Question: Does Galaxy NGS tools support targeted sequencing analysis?
3.9 years ago by BC357 (70), United States, wrote:

Glad to see that Galaxy now supports exome-seq analysis from FASTQ to variant calling. Kudos on the good job! I haven't been able to find much information, so I am curious: does Galaxy already support, or plan to support, the analysis of targeted sequencing data (i.e. sequencing data from customized chromosomal regions)?

I am NOT a computer guy, so I need tools like Galaxy to bridge the gap between FASTQ sequencing data and variant calling. Any help will be greatly appreciated. Thank you.

targeted sequencing ngs • 1.4k views
written 3.9 years ago by BC357 (70)
3.9 years ago by Bjoern Gruening (5.1k) wrote:

Do you know of any particular tools that you want to use? Galaxy can only do what its tools provide, but tools can easily be added. The Galaxy Tool Shed now hosts more than 1000 different tools, so if your particular tool is not available in Galaxy, we can probably include it.

written 3.9 years ago by Bjoern Gruening (5.1k)

I am thinking of retracting the question, as I found that bedtools may allow me to do what I need (??). It's clear that I do NOT know the NGS variant calling workflow well (if at all). I thought that instead of using BWA to align my FASTQ to the whole genome, I could align it to the region of interest only (a much faster alignment process). But it seems that some people do this in reverse: they use BWA to align to the whole genome first and then use bedtools to convert the data to a BED file for comparison with the BED file containing the region of interest. I am still hoping to get more feedback, but in the meantime, to the forum admin: please feel free to remove the question if, based on your bioinformatics knowledge, bedtools IS indeed the tool for my inquiry. Thanks.

written 3.9 years ago by BC357 (70)

Most of the variant analysis tools will also let you limit the analysis to specific regions, either at the initial alignment step or by subsequent filtering/processing with a reference file of "regions" (as you noted). You can also always use any FASTA file as a custom "reference genome". The caveat is that a reduced reference may skew some statistics with certain tools, but that depends on the tool and your goals, so it is not always important.
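To make the "filter afterwards with a regions file" idea concrete, here is a minimal Python sketch. It is not how Galaxy or bedtools implement it, only an illustration. The names `parse_bed` and `in_regions` and the example coordinates are made up for this post. It assumes the regions come in BED format (0-based, half-open intervals) and the variant positions are 1-based, as in VCF.

```python
def parse_bed(lines):
    """Parse BED lines into {chrom: [(start, end), ...]}.

    BED coordinates are 0-based and half-open: [start, end).
    """
    regions = {}
    for line in lines:
        line = line.strip()
        # Skip blank lines and the optional BED header lines.
        if not line or line.startswith(("#", "track", "browser")):
            continue
        chrom, start, end = line.split()[:3]
        regions.setdefault(chrom, []).append((int(start), int(end)))
    return regions


def in_regions(regions, chrom, pos_1based):
    """Return True if a 1-based position falls inside any target region."""
    pos0 = pos_1based - 1  # convert to the 0-based BED coordinate system
    return any(start <= pos0 < end for start, end in regions.get(chrom, []))


# Hypothetical target regions and variant positions, for illustration only.
bed = ["chr1\t100\t200", "chr2\t500\t600"]
regions = parse_bed(bed)
variants = [("chr1", 150), ("chr1", 201), ("chr2", 501), ("chr3", 10)]
on_target = [v for v in variants if in_regions(regions, *v)]
print(on_target)
```

In practice you would let an existing tool do this on the real BAM/VCF files; the sketch only shows the coordinate logic behind "keep what overlaps my regions".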

Do check out the Tool Shed, as Bjoern suggests, especially if you have a publication with a workflow you are attempting to duplicate. There you can find existing tools and get help with adding your own (or have someone help you wrap them for Galaxy).

The question is fine to leave here. I'll just reassign one of these posts to an answer - it may help others.

For general variant analysis workflow examples in Galaxy, including those that are restricted to individual chromosomes, please see: Learn, Teach Shared Pages, Workflows, tool group NGS: Variant Analysis

Dan from our team may add more.

Best, Jen, Galaxy team

written 3.9 years ago by Jennifer Hillman Jackson (25k)

One issue with aligning only to a specific region, instead of aligning to the whole genome and then filtering to the region of interest, is that reads can end up in your region that do not belong there. With the whole genome as reference, such a read might have aligned to a different region with a better alignment score, or it might have been discarded, e.g. if it maps to multiple locations and your aligner is set to discard such reads.
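A toy Python illustration of this effect follows. It is not BWA and not a real aligner: it simply counts mismatches for ungapped placements, and the sequences and the names `best_hit` and `align` are invented for the example. The read comes from a similar sequence elsewhere in the genome, yet it still gets placed in the target region when that region is the only reference available.

```python
def best_hit(read, reference):
    """Return (mismatches, offset) of the best ungapped placement, or None."""
    best = None
    for offset in range(len(reference) - len(read) + 1):
        window = reference[offset:offset + len(read)]
        mismatches = sum(a != b for a, b in zip(read, window))
        if best is None or mismatches < best[0]:
            best = (mismatches, offset)
    return best


def align(read, references):
    """Pick the reference sequence where the read has the fewest mismatches."""
    hits = {}
    for name, seq in references.items():
        hit = best_hit(read, seq)
        if hit is not None:  # reference shorter than the read: no placement
            hits[name] = hit
    name = min(hits, key=lambda n: hits[n][0])
    return name, hits[name][0]


# Made-up sequences: "elsewhere" differs from the target by a single base.
target = "ACGTACGTTAGC"
paralog = "ACGTACGATAGC"
read = "ACGTACGATAGC"  # this read truly originates from "elsewhere"

whole_genome = {"target_region": target, "elsewhere": paralog}
target_only = {"target_region": target}

print(align(read, whole_genome))  # placed at its true origin, no mismatches
print(align(read, target_only))   # forced into the target, with one mismatch
```

In the second call, the single mismatch would look like a variant in your region of interest, although the read never came from there.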

written 3.9 years ago by Daniel Blankenberg ♦♦ 1.7k