Question: Evaluating TopHat's Results
Hoang, Thanh wrote 5.3 years ago:
Hi,

I ran TopHat on Galaxy for my RNA-seq data. I want to analyze TopHat's output files, for example to get the percentage of reads mapped to the genome, but I am not sure how to do that. I am also trying to visualize the BAM file in IGB, but the following error message appears: "Failed to authenticate to the server". Can anyone help with these issues?

Thanks so much,
Thanh
Jennifer Hillman Jackson (United States) wrote 5.3 years ago:
Hello Thanh,

The tool "NGS: Picard (beta) -> SAM/BAM Alignment Summary Metrics" may be the tool you are looking for. There are others in this tool group that add up numbers in BAM or SAM files, and SAMTools has "flagstat", so if the result is not exactly right you could create your own calculation with one of those, plus a count on the FASTQ inputs, and the "Compute" tool.

Are you using the public Main Galaxy instance at https://main.g2.bx.psu.edu/ (usegalaxy.org) and clicking over to connect to the Genomic HyperBrowser via the web? Or are you doing something else? Can you give this another try this morning and see if it is working?

Hopefully the first part helped; let us know about the second.

Take care,
Jen
Galaxy team

--
Jennifer Hillman-Jackson
Galaxy Support and Training
http://galaxyproject.org
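The "own calculation" suggested above (a flagstat count combined with a count of the FASTQ input) could be sketched outside Galaxy roughly as follows. This is only an illustration: the function names `parse_flagstat`, `count_fastq_reads`, and `percent_mapped` are made up for this example, and it assumes an uncompressed FASTQ file with exactly four lines per read.

```python
import re


def parse_flagstat(text):
    """Extract the 'in total' and 'mapped' QC-passed counts from flagstat text."""
    counts = {}
    for line in text.splitlines():
        # Lines look like: "44066574 + 0 mapped (100.00%:-nan%)"
        m = re.match(r"\s*(\d+) \+ (\d+) (in total|mapped)\b", line)
        if m:
            key = "total" if m.group(3) == "in total" else "mapped"
            counts[key] = int(m.group(1))
    return counts


def count_fastq_reads(path):
    """Count reads in an uncompressed FASTQ file (assumes 4 lines per record)."""
    with open(path) as handle:
        n_lines = sum(1 for _ in handle)
    return n_lines // 4


def percent_mapped(mapped, total_reads):
    """Percentage of input reads that were mapped; NaN if there are no reads."""
    return 100.0 * mapped / total_reads if total_reads else float("nan")
```

Note that flagstat counts alignment records rather than distinct reads, so the "mapped" value is only a valid numerator when every read has at most one alignment in the BAM file.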
Hi Jen,

Thanks so much for your advice.

> The tool "NGS: Picard (beta) -> SAM/BAM Alignment Summary Metrics" may be the tool you are looking for. There are others in this tool group that add up numbers in BAM or SAM files, and SAMTools has "flagstat", so you could create your own calculation with one of those, plus a count on the fastq inputs, and the "Compute" tool, if it is not exactly right.

I ran Flagstat on my TopHat output BAM file. I am now confused about the result:

    44066574 + 0 in total (QC-passed reads + QC-failed reads)
    0 + 0 duplicates
    44066574 + 0 mapped (100.00%:-nan%)
    0 + 0 paired in sequencing
    0 + 0 read1
    0 + 0 read2
    0 + 0 properly paired (-nan%:-nan%)
    0 + 0 with itself and mate mapped
    0 + 0 singletons (-nan%:-nan%)
    0 + 0 with mate mapped to a different chr
    0 + 0 with mate mapped to a different chr (mapQ>=5)

Note that my RNA-seq data is single-end. The raw data for this sample, before mapping with TopHat, has only 33174286 reads. My questions are: why do I have more mapped reads in the BAM file from TopHat's output than in the input, and does this BAM file contain only the mapped reads (NOT the unmapped reads)?

I have also tried the SAM/BAM Alignment Summary Metrics tool. This time I get 25541681 reads from the BAM file (the result seems to show only mapped reads). Is that the number I should expect?

Thank you,
Thanh
Reply written 5.3 years ago by Hoang, Thanh
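The thread does not record an answer to this last question. One common explanation, offered here as an assumption rather than a confirmed diagnosis of this dataset, is that flagstat counts alignment records, and TopHat can write several records for a read that maps to more than one location, so the record count can exceed the number of input reads. A minimal sketch for checking this counts both alignment records and distinct read names in SAM text (for example, the output of `samtools view`). The function name `summarize_sam` is made up for this example, and holding tens of millions of read names in a set needs a good deal of memory.

```python
def summarize_sam(lines):
    """Return (mapped alignment records, distinct mapped read names).

    `lines` is any iterable of SAM text lines; header lines starting
    with '@' are skipped. Intended for single-end data, where one read
    name corresponds to one read.
    """
    alignments = 0
    mapped_reads = set()
    for line in lines:
        if not line.strip() or line.startswith("@"):
            continue
        fields = line.rstrip("\n").split("\t")
        qname, flag = fields[0], int(fields[1])
        if flag & 0x4:  # 0x4 = read unmapped
            continue
        alignments += 1
        mapped_reads.add(qname)
    return alignments, len(mapped_reads)


# Possible usage, reading SAM text from standard input:
# import sys
# alignments, reads = summarize_sam(sys.stdin)
```

If the second number comes out near the 25541681 reported by the Alignment Summary Metrics tool, then that figure divided by the 33174286 input reads would put the mapping rate at about 77%.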