Question: How do I obtain uniquely mappable reads from bowtie on Galaxy?
Jason (United States) wrote, 4.1 years ago:

Hello,

I'm using bowtie on Galaxy to map my RNA-seq reads to the sacCer2 yeast ref genome. I was wondering which option or flag is responsible for making sure I have an output of uniquely mappable reads? 

My workflow is the following:

1. Start with SOLiD output in the form of .csfasta reads and .qual quality files.

2. Use the "convert SOLiD output to fastq" program to obtain a fastqsanger file.

3. Use bowtie to map to the yeast ref (output = SAM file with ~50 million lines).

4. Use "convert BAM to BED" under the bedtools directory to obtain a BED file (output = BED file with ~20 million lines).

I've already mapped this sequencing run using SHRiMP and got ~20 million uniquely mappable reads for the same sample, so I'm inclined to believe the BED file from step 4 contains my uniquely mappable reads. However, why does my SAM file have more than twice as many lines (see step 3)? Which options dictate whether the output is restricted to uniquely mappable reads? And does Galaxy make bowtie report only uniquely mappable reads by default?

Thanks

solid bowtie rna-seq • 3.1k views
modified 4.1 years ago by Bjoern Gruening • written 4.1 years ago by Jason
Bjoern Gruening (Germany) wrote, 4.1 years ago:

Hi,

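It seems you are looking for the -m option. -m <n> suppresses all alignments for a read if more than n reportable alignments exist, so -m 1 keeps only reads that align to a single location. You will find it under "Bowtie settings to use:" → "Full parameter list". Note that -m exists only in bowtie (version 1); bowtie2 no longer has it. As far as I know, bowtie2 by default reports one best alignment per read, choosing at random among equally good candidates, so its default output is not limited to uniquely mapped reads; ambiguous reads are instead marked by a low mapping quality.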
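To make the -m behaviour concrete, here is a minimal Python sketch of the rule described above. It is only an illustration of the semantics, not bowtie's implementation; the function name `apply_m_ceiling` and the toy hit list are made up for this example.

```python
from collections import defaultdict


def apply_m_ceiling(alignments, m=1):
    """Keep alignments only for reads with at most `m` reportable hits.

    `alignments` is an iterable of (read_name, location) pairs.
    A read with more than `m` hits loses all of its alignments,
    which is the behaviour described for the -m option.
    """
    by_read = defaultdict(list)
    for read_name, location in alignments:
        by_read[read_name].append(location)
    return {name: locs for name, locs in by_read.items() if len(locs) <= m}


# Toy data (hypothetical): read2 aligns to two places, the others to one.
hits = [
    ("read1", "chrI:100"),
    ("read2", "chrII:500"),
    ("read2", "chrIV:900"),
    ("read3", "chrV:42"),
]

unique = apply_m_ceiling(hits, m=1)
print(sorted(unique))  # ['read1', 'read3']
```

With m=1 the multi-mapping read disappears entirely rather than keeping one of its two hits, which is what makes the remaining set "uniquely mappable".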

Cheers,

Bjoern


Remember that if you want to stay with BAM/SAM format (and keep all the additional information it contains), the tools in the group "NGS: SAM Tools" can be used. Try: BAM->SAM, Filter SAM, SAM->BAM. Once you have an analysis path you like, capture it in a Workflow for reuse (to eliminate the tedium and enhance reproducibility). -Jen
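For anyone curious what a "Filter SAM" step amounts to, below is a rough Python sketch, assuming the filter is "keep only mapped records". It relies on the SAM convention that bit 0x4 of the FLAG column marks an unmapped read. The function name `keep_mapped` and the three example lines are invented for illustration; this is not the Galaxy tool's code.

```python
def keep_mapped(sam_lines):
    """Yield header lines and records whose FLAG lacks the 0x4 (unmapped) bit."""
    for line in sam_lines:
        if not line.strip():
            continue  # skip blank lines
        if line.startswith("@"):
            yield line  # header lines pass through unchanged
            continue
        fields = line.rstrip("\n").split("\t")
        flag = int(fields[1])  # FLAG is the second SAM column
        if not flag & 0x4:
            yield line


# Toy SAM content (hypothetical reads and reference length).
example = [
    "@SQ\tSN:chrI\tLN:1000\n",
    "r1\t0\tchrI\t101\t255\t5M\t*\t0\t0\tACGTA\tIIIII\n",
    "r2\t4\t*\t0\t0\t*\t*\t0\t0\tACGTA\tIIIII\n",
]

filtered = list(keep_mapped(example))
print(len(filtered))  # 2: the header plus the one mapped read
```

Because the header and all SAM columns are passed through untouched, the result can still be converted back to BAM afterwards.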

modified 4.1 years ago • written 4.1 years ago by Jennifer Hillman Jackson

Thanks, I figured it might be that, but I wasn't sure. Also, do you know why the number of lines in my output went from 50 million to 20 million when I went from SAM to BED?

written 4.1 years ago by Jason

Unmapped reads are dropped: they have no reference coordinates, so they can't be turned into BED records. The SAM file still lists them, which is why it has more lines.
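Here is a small Python sketch of why the line count shrinks, loosely modelled on what a SAM/BAM to BED conversion has to do. It assumes standard SAM columns; the name `sam_to_bed` and the toy records are made up, and this is not the bedtools implementation.

```python
import re

CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")
REF_CONSUMING = set("MDN=X")  # CIGAR operations that advance along the reference


def sam_to_bed(sam_lines):
    """Return (bed_records, dropped), where dropped counts unmapped reads."""
    bed, dropped = [], 0
    for line in sam_lines:
        if line.startswith("@") or not line.strip():
            continue  # header and blank lines produce no BED record
        f = line.rstrip("\n").split("\t")
        name, flag, chrom, pos, mapq, cigar = f[0], int(f[1]), f[2], int(f[3]), f[4], f[5]
        if flag & 0x4:
            dropped += 1  # unmapped: no chromosome or position to report
            continue
        ref_len = sum(int(n) for n, op in CIGAR_OP.findall(cigar) if op in REF_CONSUMING)
        start = pos - 1  # SAM positions are 1-based, BED starts are 0-based
        strand = "-" if flag & 0x10 else "+"
        bed.append((chrom, start, start + ref_len, name, mapq, strand))
    return bed, dropped


# Toy SAM content (hypothetical): two mapped reads and one unmapped read.
sam = [
    "@HD\tVN:1.0\n",
    "r1\t0\tchrI\t101\t255\t5M\t*\t0\t0\tACGTA\tIIIII\n",
    "r2\t16\tchrII\t11\t255\t3M1D2M\t*\t0\t0\tACGTA\tIIIII\n",
    "r3\t4\t*\t0\t0\t*\t*\t0\t0\tACGTA\tIIIII\n",
]

records, dropped = sam_to_bed(sam)
print(len(records), dropped)  # 2 1
```

Four input lines become two BED records: the header line and the unmapped read have nothing to contribute, which is the same effect as going from ~50 million SAM lines to ~20 million BED lines.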

written 4.1 years ago by Istvan Albert