Question: Please Help Me Check The Quality Of The Tophat Mapping To Reference Genome
6.3 years ago, Du, Jianguang wrote:
Dear All,

I ran "Flagstat" under "NGS: SAM Tools" to check the quality of the TopHat output (the accepted hits file). I got the following results:

9471730 + 0 in total (QC-passed reads + QC-failed reads)
0 + 0 duplicates
9471730 + 0 mapped (100.00%:-nan%)
0 + 0 paired in sequencing
0 + 0 read1
0 + 0 read2
0 + 0 properly paired (-nan%:-nan%)
0 + 0 with itself and mate mapped
0 + 0 singletons (-nan%:-nan%)
0 + 0 with mate mapped to a different chr
0 + 0 with mate mapped to a different chr (mapQ>=5)

I ran TopHat with the settings shown below:

Will you select a reference genome from your history or use a built-in index?: Use a built-in index
Select a reference genome: /galaxy/data/mm9/bowtie_index/mm9
Is this library mate-paired?: Single-end
TopHat settings to use: Full parameter list
Library Type: FR Unstranded
Anchor length (at least 3): 8
Maximum number of mismatches that can appear in the anchor region of spliced alignment: 0
The minimum intron length: 70
The maximum intron length: 500000
Allow indel search: Yes
Max insertion length: 3
Max deletion length: 3
Maximum number of alignments to be allowed: 20
Minimum intron length that may be found during split-segment (default) search: 50
Maximum intron length that may be found during split-segment (default) search: 500000
Number of mismatches allowed in the initial read mapping: 1
Number of mismatches allowed in each segment alignment for reads mapped independently: 1
Minimum length of read segments: 25
Use Own Junctions: Yes
Use Gene Annotation Model: Yes
Gene Model Annotations: iGenome version of mm9 genes. GTF
Use Raw Junctions: No
Only look for supplied junctions: No
Use Closure Search: No
Use Coverage Search: Yes
Minimum intron length that may be found during coverage search: 50
Maximum intron length that may be found during coverage search: 20000
Use Microexon Search: No

Please help me find out what is wrong with the TopHat run.

Thanks,
Jianguang
alignment bowtie • 2.7k views
6.3 years ago, Jennifer Hillman Jackson (United States) wrote:
Hello Jianguang,

This is the expected output from this particular tool. Your TopHat output file 'accepted hits' contains only mapped data, so every read in it is counted as mapped.

I did notice this option for the TopHat run: "Is this library mate-paired? Single-end". Your data was originally paired-end, so this is unexpected. But perhaps you are working with a different dataset (or datasets) now? If you are running with the original paired dataset, then this would be an option to correct: change mate-paired to yes and run TopHat with both the forward and reverse reads in a single mapping process. (This is the same method as in the RNA-seq tutorial.)

http://main.g2.bx.psu.edu/u/jeremy/p/galaxy-rna-seq-analysis-exercise

Best,
Jen
Galaxy team

--
Jennifer Jackson
http://galaxyproject.org
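For readers who want to see why these numbers are expected, here is a minimal Python sketch that parses the Flagstat text posted in the question and recomputes the percentages. It is an illustration only, not part of Galaxy or SAMtools: the helper names (parse_flagstat, count, percent, summarize) are made up for this example, and it assumes that the two values inside the parentheses are the QC-passed and QC-failed percentages and that the paired percentages are taken relative to "paired in sequencing". Under those assumptions, "-nan%" is just a zero divided by zero, which is what a single-end run with no QC-failed reads would produce.

```python
# Flagstat report copied from the question above.
FLAGSTAT_TEXT = """\
9471730 + 0 in total (QC-passed reads + QC-failed reads)
0 + 0 duplicates
9471730 + 0 mapped (100.00%:-nan%)
0 + 0 paired in sequencing
0 + 0 read1
0 + 0 read2
0 + 0 properly paired (-nan%:-nan%)
0 + 0 with itself and mate mapped
0 + 0 singletons (-nan%:-nan%)
0 + 0 with mate mapped to a different chr
0 + 0 with mate mapped to a different chr (mapQ>=5)
"""


def parse_flagstat(text):
    """Return a list of (qc_passed, qc_failed, label) tuples, one per line."""
    records = []
    for line in text.splitlines():
        parts = line.strip().split(" ", 3)
        # Expected shape: "<passed> + <failed> <label...>"
        if len(parts) == 4 and parts[1] == "+":
            records.append((int(parts[0]), int(parts[2]), parts[3]))
    return records


def count(records, prefix):
    """Return (qc_passed, qc_failed) for the first label starting with prefix."""
    for passed, failed, label in records:
        if label.startswith(prefix):
            return passed, failed
    raise KeyError(prefix)


def percent(numerator, denominator):
    """Percentage, or None when the denominator is zero (shown as -nan%)."""
    if denominator == 0:
        return None
    return 100.0 * numerator / denominator


def summarize(records):
    total_passed, _ = count(records, "in total")
    mapped_passed, _ = count(records, "mapped")
    paired_passed, _ = count(records, "paired in sequencing")
    proper_passed, _ = count(records, "properly paired")
    return {
        "mapped_pct": percent(mapped_passed, total_passed),
        # Assumption: paired percentages are relative to paired reads.
        "properly_paired_pct": percent(proper_passed, paired_passed),
        # No paired reads at all suggests a single-end library.
        "single_end": paired_passed == 0,
    }


if __name__ == "__main__":
    print(summarize(parse_flagstat(FLAGSTAT_TEXT)))
```

With the report from the question, the mapped percentage is 100 because the accepted hits file holds only mapped reads, and the paired percentages are undefined because no reads are flagged as paired.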
Hi Jen,

Thank you very much for your information. I will not worry about the TopHat outputs now.

For this particular run, I used a single-end dataset. The whole experiment contains both paired-end and single-end datasets. I ran TopHat with the paired-end setting for the paired-end datasets and the single-end setting for the single-end datasets, and then ran Cufflinks, Cuffmerge, and Cuffdiff.

Jianguang
Written 6.3 years ago by Du, Jianguang