Question: TopHat align summary
14 months ago, Diana Afonso wrote:


I am using Galaxy to analyze RNA-seq data: 100 bp single-end reads sequenced on an Illumina 2500. I started by uploading BAM files and using the tool "Convert from BAM to FastQ" to convert my data into FastQ format. Next, I used "FastQ groomer" to convert the data to a format with Sanger-scaled, ASCII-encoded quality values. Finally, I ran TopHat with the default settings, and the align summary is the following:

Reads:
    Input: 14520796
    Mapped: 6298973 (43.4% of input)
        of these: 1080332 (17.2%) have multiple alignments (0 have >20)
43.4% overall read mapping rate.
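For reference, the percentages in this summary can be recomputed from the raw counts. The sketch below is only an illustration: the function name `parse_align_summary` is hypothetical, and the regular expressions assume the single-end summary layout shown above.

```python
import re

# Align summary text as reported above (single-end run).
SUMMARY = """Reads:
    Input: 14520796
    Mapped: 6298973 (43.4% of input)
        of these: 1080332 (17.2%) have multiple alignments (0 have >20)
43.4% overall read mapping rate."""


def parse_align_summary(text):
    """Return (input_reads, mapped_reads, multi_mapped_reads) as integers.

    Hypothetical helper: assumes the single-end layout shown above.
    """
    total = int(re.search(r"Input\s*:\s*(\d+)", text).group(1))
    mapped = int(re.search(r"Mapped\s*:\s*(\d+)", text).group(1))
    multi = int(re.search(r"of these:\s*(\d+)", text).group(1))
    return total, mapped, multi


total, mapped, multi = parse_align_summary(SUMMARY)

# Mapping rate is relative to all input reads.
mapping_rate = 100.0 * mapped / total
# Multi-mapping rate is relative to the mapped reads only.
multi_rate = 100.0 * multi / mapped
# Reads that TopHat could not place at all.
unmapped = total - mapped

print(f"mapped: {mapping_rate:.1f}% of input")
print(f"multi-mapped: {multi_rate:.1f}% of mapped")
print(f"unmapped reads: {unmapped}")
```

Note that the 17.2% figure is a fraction of the mapped reads, not of the input, so roughly 8.2 million reads are left unmapped.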

I think the percentage of mapped sequences is very low. Could you please give me some tips on how I should alter the TopHat parameters to improve my results?

Thank you, Diana

14 months ago, Jennifer Hillman Jackson (United States) wrote:


Double check the fastq quality score formatting. This is how:
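As a rough illustration of what such a check involves (not necessarily the procedure originally linked here), the sketch below scans the quality lines of a FASTQ file and guesses the offset from the range of ASCII codes. The function name `guess_quality_encoding` and the return labels are hypothetical, and the code assumes plain four-line FASTQ records with no line wrapping.

```python
def guess_quality_encoding(fastq_lines, max_reads=10000):
    """Guess the quality offset from the ASCII range of the quality strings.

    Hypothetical helper. Assumes each record is exactly four lines
    (header, sequence, '+', quality). Returns 'phred33', 'phred64',
    'ambiguous', or 'unknown' (no reads seen).
    """
    lowest, highest = 255, 0
    reads = 0
    for i, line in enumerate(fastq_lines):
        if i % 4 != 3:          # only every fourth line holds qualities
            continue
        qual = line.rstrip("\r\n")
        if qual:
            codes = [ord(c) for c in qual]
            lowest = min(lowest, min(codes))
            highest = max(highest, max(codes))
        reads += 1
        if reads >= max_reads:  # a sample of reads is usually enough
            break
    if reads == 0 or highest == 0:
        return "unknown"
    # Characters below ';' (ASCII 59) occur only with a +33 offset
    # (Sanger / Illumina 1.8+).
    if lowest < 59:
        return "phred33"
    # Sanger-scaled scores rarely exceed 'J' (ASCII 74), so higher codes
    # with no low ones suggest a +64 offset (older Solexa/Illumina).
    if highest > 74:
        return "phred64"
    return "ambiguous"


# Small in-memory demo; for a real file, pass an open file handle instead.
sanger_like = ["@r1", "ACGT", "+", "II#5"]
old_illumina_like = ["@r1", "ACGT", "+", "hhgf"]
print(guess_quality_encoding(sanger_like))
print(guess_quality_encoding(old_illumina_like))
```

If the guess does not match the datatype assigned in Galaxy, the quality scores may be getting misinterpreted during mapping, so that is worth correcting before changing any TopHat parameters.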

It would be a good idea to run FastQC again if you make adjustments to the reads. That report will alert you to quality problems within the data.

After that, this RNA-seq tutorial might be useful, along with the TopHat manual:

Thanks, Jen, Galaxy team
