Question: TopHat align summary
2.2 years ago, Diana Afonso wrote:


I am using Galaxy to analyze 100 bp single-end RNA-seq data, sequenced on an Illumina 2500. I started by uploading BAM files and converting them to FASTQ with the "Convert from BAM to FastQ" tool. Next I ran "FastQ groomer" to convert the data to Sanger-scaled, ASCII-encoded quality values. Finally I ran TopHat with the default settings, and the align summary is the following:

Reads:
    Input:    14520796
    Mapped:   6298973 (43.4% of input)
    of these: 1080332 (17.2%) have multiple alignments (0 have >20)
43.4% overall read mapping rate.
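
As a quick sanity check, the percentages in the summary can be recomputed from the raw counts. This is only a minimal sketch: the three counts are copied from the report above, and the variable names are illustrative.

```python
# Counts copied from the TopHat align summary above.
input_reads = 14520796
mapped_reads = 6298973
multi_mapped = 1080332

# Share of input reads that aligned at least once.
mapped_pct = 100 * mapped_reads / input_reads
# Share of the mapped reads that aligned to more than one location.
multi_pct = 100 * multi_mapped / mapped_reads
# Reads that did not align at all.
unmapped = input_reads - mapped_reads

print(f"mapped: {mapped_pct:.1f}% of input")
print(f"multi-mapped: {multi_pct:.1f}% of mapped reads")
print(f"unmapped reads: {unmapped}")
```

More than half of the input reads are unmapped, so those reads are the ones worth inspecting first.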

I think the percentage of mapped sequences is very low. Could you please give me some tips on how I should alter the TopHat parameters to improve my results?

Thank you, Diana

2.2 years ago, Jennifer Hillman Jackson (United States) wrote:


Double check the fastq quality score formatting. This is how:

It would be a good idea to run FastQC again if you make adjustments to the reads. That report will alert you to quality problems within the data.
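
For readers who want a rough check outside Galaxy, one common heuristic (not necessarily the method referred to above) is to look at the range of ASCII codes on the quality lines. The sketch below assumes plain four-line FASTQ records with no wrapped sequence lines. The function name `guess_quality_encoding` and the thresholds are illustrative assumptions, not part of any Galaxy or TopHat tool. FastQC also reports an "Encoding" field in its Basic Statistics section.

```python
import io


def guess_quality_encoding(handle, max_records=10000):
    """Guess the Phred offset from the ASCII range of FASTQ quality lines.

    Assumes four lines per record; the thresholds are a heuristic only.
    """
    lowest, highest = 255, 0
    for i, line in enumerate(handle):
        if i // 4 >= max_records:
            break
        if i % 4 == 3:  # the 4th line of each record is the quality string
            codes = [ord(c) for c in line.rstrip("\r\n")]
            if codes:
                lowest = min(lowest, min(codes))
                highest = max(highest, max(codes))
    if lowest < 59:
        # Characters below ';' cannot occur in the older +64 encodings.
        return "Sanger / Illumina 1.8+ (Phred+33)"
    if highest > 74:
        # Characters above 'J' are not expected in Phred+33 Illumina data.
        return "Illumina 1.3-1.7 or Solexa (Phred+64)"
    return "ambiguous"


# Tiny in-memory example; a real file would be opened with open(path).
example = io.StringIO("@read1\nACGT\n+\n!5?I\n")
print(guess_quality_encoding(example))
```

If the result is ambiguous, sampling more records or checking the FastQC report is the safer route.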

After that, this RNA-seq tutorial might be useful, along with the TopHat manual:

Thanks, Jen, Galaxy team
