How do I obtain uniquely mappable reads from bowtie on galaxy?

4.6 years ago, Jason • 30 (United States) wrote:

Hello,

I'm using bowtie on Galaxy to map my RNA-seq reads to the sacCer2 yeast reference genome. Which option or flag ensures that the output contains only uniquely mappable reads?

My workflow is the following:

1. Start with SOLiD output in the form of .csfasta reads and .qual quality files.

2. Use the "convert SOLiD output to fastq" tool to obtain a fastqsanger file.

3. Use bowtie to map to the yeast reference (output = SAM file with ~50 million lines).

4. Use "convert BAM to BED" under the bedtools directory to obtain a BED file (output = BED file with ~20 million lines).

I've already mapped this sequencing run using SHRiMP and got ~20 million uniquely mappable reads for the same sample, so I'm inclined to believe that the BED file from step 4 contains my uniquely mappable reads. However, why does my SAM file have more than twice as many lines (see step 3)? Which options dictate whether the output contains only uniquely mappable reads? And does Galaxy make bowtie report only uniquely mappable reads by default?

Thanks

solid bowtie rna-seq • 3.5k views

modified 4.6 years ago by Bjoern Gruening ♦ 5.1k • written 4.6 years ago by Jason • 30

4.6 years ago, Bjoern Gruening ♦ 5.1k (Germany) wrote:

Hi,

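It seems you are looking for the -m option. -m <n> suppresses all alignments for a read if more than <n> reportable alignments exist, so -m 1 keeps only reads with a single reportable alignment. You will find it under "Bowtie settings to use:" → "Full parameter list". The -m parameter is only available in bowtie; bowtie2 does not offer it. As far as I know, bowtie2 by default searches for multiple alignments and reports only the best one per read, so multi-mapping reads can still appear in its output (typically with a low MAPQ) and have to be filtered out afterwards if you want strictly unique reads.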
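To illustrate what a ceiling of one alignment per read means, here is a minimal Python sketch (not bowtie's own code). It assumes single-end reads and a SAM file in which the aligner reported every alignment it found; with bowtie's default of one reported alignment per read, counting records cannot reveal multi-mappers, so this is no substitute for -m. The function name `unique_mapped` is made up for this example.

```python
from collections import Counter


def unique_mapped(sam_lines):
    """Return SAM alignment lines for reads that have exactly one mapped record.

    Assumption: every alignment the aligner found is present in the input,
    so a read name that appears more than once is a multi-mapper.
    """
    records = []
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):
            continue  # skip blank lines and SAM header lines
        fields = line.rstrip("\n").split("\t")
        flag = int(fields[1])
        if flag & 0x4:
            continue  # FLAG bit 0x4 set: the read is unmapped
        records.append((fields[0], line))

    # Count mapped records per read name and keep the reads seen exactly once.
    counts = Counter(name for name, _ in records)
    return [line for name, line in records if counts[name] == 1]
```

Running it on the lines of a SAM file returns only the records whose read name occurs once among the mapped alignments.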

Cheers,

Bjoern

Remember that if you want to stay with BAM/SAM format (and keep all the additional information it contains), you can use the tools in the group "NGS: SAM Tools". Try: BAM->SAM, Filter SAM, SAM->BAM. Once you have an analysis path you like, capture it in a Workflow for reuse (to eliminate tedious repetition and enhance reproducibility) -Jen
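As a rough picture of what a flag-based SAM filter does, here is a small Python sketch. It is not the Galaxy "Filter SAM" tool itself; the function name `filter_sam` and its parameters `exclude_flags` and `min_mapq` are hypothetical and chosen only for illustration. Header lines are passed through so that the output is still a valid SAM file.

```python
def filter_sam(sam_lines, exclude_flags=0x4, min_mapq=0):
    """Yield header lines plus the alignment lines that pass the filter.

    exclude_flags: records with any of these FLAG bits set are dropped
                   (default 0x4, i.e. unmapped reads).
    min_mapq:      records with MAPQ below this value are dropped.
    """
    for line in sam_lines:
        if not line.strip():
            continue  # ignore blank lines
        if line.startswith("@"):
            yield line  # keep the SAM header untouched
            continue
        fields = line.split("\t")
        flag = int(fields[1])  # column 2: bitwise FLAG
        mapq = int(fields[4])  # column 5: mapping quality
        if flag & exclude_flags:
            continue
        if mapq < min_mapq:
            continue
        yield line
```

With the defaults this only removes unmapped reads; raising `min_mapq` additionally drops low-confidence alignments, although a sensible threshold depends on how the aligner assigns MAPQ.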

modified 4.6 years ago • written 4.6 years ago by Jennifer Hillman Jackson ♦ 25k

Thanks, I figured it might be that but I wasn't sure. Also, do you know why the number of lines in my output went from 50 million to 20 million when I went from SAM to BED?

written 4.6 years ago by Jason • 30

Unmapped reads are dropped: they have no genomic coordinates, so they can't be turned into BED records.
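If you want to check this on your own data, a short Python sketch like the following tallies header lines, mapped records and unmapped records in a SAM file, using FLAG bit 0x4 to recognise unmapped reads. The function name `count_sam_lines` is made up for this example.

```python
def count_sam_lines(sam_lines):
    """Count header lines, mapped records and unmapped records in SAM text."""
    counts = {"header": 0, "mapped": 0, "unmapped": 0}
    for line in sam_lines:
        if not line.strip():
            continue  # ignore blank lines
        if line.startswith("@"):
            counts["header"] += 1
            continue
        flag = int(line.split("\t")[1])  # column 2: bitwise FLAG
        if flag & 0x4:
            counts["unmapped"] += 1  # bit 0x4: read did not align
        else:
            counts["mapped"] += 1
    return counts
```

For a file on disk (the name `alignments.sam` is only a placeholder), pass an open file handle: `count_sam_lines(open("alignments.sam"))`. If the explanation above holds, the "mapped" count should be close to the number of lines in the BED file.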

written 4.6 years ago by Istvan Albert ♦ 250
