I'm using bowtie on Galaxy to map my RNA-seq reads to the sacCer2 yeast reference genome. Which option or flag ensures that the output contains only uniquely mappable reads?
My workflow is the following:
1. Start with SOLiD output in the form of .csfasta reads and .qual quality files.
2. Use the "convert SOLiD output to fastq" tool to obtain a fastqsanger file.
3. Use bowtie to map to the yeast reference (output = SAM file with ~50 million lines).
4. Use "convert BAM to BED" under the bedtools directory to obtain a BED file (output = BED file with ~20 million lines).
I've already mapped this sequencing run using SHRiMP and got ~20 million uniquely mappable reads for the same sample, so I'm inclined to believe that the BED file from step 4 contains my uniquely mappable reads. However, why does my SAM file from step 3 have more than twice as many lines? Which options determine whether the output is restricted to uniquely mappable reads? And does Galaxy make bowtie report only uniquely mappable reads by default?
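To help narrow down where the extra SAM lines come from, here is a minimal sketch of a tally I could run on the step 3 output. It assumes the file is plain-text SAM in the standard layout: header lines start with "@", the read name is in column 1, and the FLAG is in column 2 with bit 0x4 set for unmapped reads. The function name `tally_sam_lines` and the file name in the usage comment are made up for illustration, and this is only a diagnostic, not a statement of how bowtie or Galaxy decide what to report.

```python
def tally_sam_lines(lines):
    """Classify SAM lines as header, unmapped read, or mapped alignment.

    Also counts how many read names appear in exactly one mapped line.
    Note: this keeps every mapped read name in memory, which may be heavy
    for tens of millions of reads.
    """
    counts = {"header": 0, "unmapped": 0, "mapped": 0}
    alignments_per_read = {}

    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue
        if line.startswith("@"):
            # Header lines (@HD, @SQ, @PG, ...) are not reads.
            counts["header"] += 1
            continue
        fields = line.split("\t")
        name, flag = fields[0], int(fields[1])
        if flag & 0x4:
            # FLAG bit 0x4 marks a read with no reported alignment.
            counts["unmapped"] += 1
        else:
            counts["mapped"] += 1
            alignments_per_read[name] = alignments_per_read.get(name, 0) + 1

    counts["reads_with_one_alignment"] = sum(
        1 for n in alignments_per_read.values() if n == 1
    )
    return counts


# Hypothetical usage (file name is an assumption):
# with open("bowtie_output.sam") as handle:
#     print(tally_sam_lines(handle))
```

If most of the difference turns out to be "unmapped" lines, that would suggest the SAM file simply lists every input read, while the BED conversion keeps only the aligned ones.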