2.8 years ago by
ajmaninisha30 wrote:


I am using an RNA-seq pipeline for transcriptome analysis. I have the annotation and GFF file of the genome to be used as a reference. My goal is to measure the rate of change of gene expression between different samples.

For this I am using the workflow: TopHat, Cufflinks, Cuffmerge and Cuffdiff.

1. While running TopHat I got an alignment summary showing that only 25% of the reads mapped, which is not a good sign, and I know that mapping depends on several factors. My question is: the 75% of reads that are unmapped may contain important information, and I cannot ignore them. Does anyone have an idea of how to proceed?


2. After TopHat, will Cufflinks take only the 25% of reads that mapped as input, or will it take all of them?


3. Can FPKM give me a better idea of expression? In some of the samples it gives me a value of zero.


Kindly advise.





mapped reads tophat • 824 views
2.8 years ago by
United States
Jennifer Hillman Jackson (25k) wrote:


To improve read mapping, try running FastQC on the data to determine whether there are quality issues. The sequence data can then be trimmed if needed. The TopHat parameters can also be examined. If your reads are short (< 50 bases), some of the full parameter settings will need adjustment.
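To make the trimming idea concrete, here is a minimal Python sketch of 3'-end quality trimming for a single read. It is only an illustration: the threshold `min_q=20` and the Phred+33 encoding are assumptions, the function name `trim_3prime` is made up for this example, and dedicated trimming tools use more careful schemes (for example sliding windows).

```python
def trim_3prime(seq, qual, min_q=20, offset=33):
    """Remove bases from the 3' end while their Phred quality is below min_q.

    seq    : read sequence
    qual   : FASTQ quality string, same length as seq
    min_q  : assumed quality threshold (hypothetical default)
    offset : ASCII offset of the quality encoding (Phred+33 assumed)
    """
    end = len(seq)
    # Walk back from the 3' end until a base passes the threshold.
    while end > 0 and ord(qual[end - 1]) - offset < min_q:
        end -= 1
    return seq[:end], qual[:end]


# 'I' encodes Q40 and '#' encodes Q2 under Phred+33,
# so the last two bases are removed.
trimmed_seq, trimmed_qual = trim_3prime("ACGTACGT", "IIIIII##")
print(trimmed_seq, trimmed_qual)
```

Reads that become very short after trimming are usually discarded before mapping, since they align ambiguously.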

Cufflinks will only consider reads that are both mapped with splicing AND in a proper pair (when paired-end input is provided). 
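If you want to see how your alignments break down into these categories, the following Python sketch tallies SAM records by flag and CIGAR string. It assumes plain SAM text as input (for example, a BAM file converted with `samtools view`); the function name `summarize_sam` is hypothetical. Note that it counts alignment records, not unique reads, so multi-mapped reads are counted more than once.

```python
def summarize_sam(lines):
    """Count mapped, properly paired and spliced records in SAM text."""
    counts = {"records": 0, "mapped": 0, "proper_pair": 0, "spliced": 0}
    for line in lines:
        if line.startswith("@"):          # skip header lines
            continue
        fields = line.rstrip("\n").split("\t")
        flag, cigar = int(fields[1]), fields[5]
        counts["records"] += 1
        if flag & 0x4:                    # 0x4: read is unmapped
            continue
        counts["mapped"] += 1
        if flag & 0x1 and flag & 0x2:     # 0x1: paired, 0x2: proper pair
            counts["proper_pair"] += 1
        if "N" in cigar:                  # 'N' marks a skipped region (intron)
            counts["spliced"] += 1
    return counts


# Three made-up records: spliced proper pair, unmapped, unspliced single read.
example = [
    "@HD\tVN:1.0",
    "r1\t99\tchr1\t100\t50\t20M500N30M\t=\t700\t650\tACGT\tIIII",
    "r2\t4\t*\t0\t0\t*\t*\t0\t0\tACGT\tIIII",
    "r3\t0\tchr1\t200\t50\t50M\t*\t0\t0\tACGT\tIIII",
]
print(summarize_sam(example))
```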

An FPKM value of zero ("0") indicates that there was a problem. The possible reasons are listed in the manual; examine the field definitions for an expression testing result file.
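As background on what the number means, FPKM stands for fragments per kilobase of transcript per million mapped fragments. The Python sketch below uses the textbook definition only; Cufflinks itself estimates abundances with a statistical model, so its reported values will not match this simple calculation exactly. The function name and the example numbers are made up for illustration.

```python
def fpkm(fragments, length_bp, total_mapped_fragments):
    """Textbook FPKM: fragments per kilobase per million mapped fragments.

    fragments              : fragments assigned to the transcript
    length_bp              : transcript length in bases
    total_mapped_fragments : all mapped fragments in the sample
    """
    return fragments * 1e9 / (length_bp * total_mapped_fragments)


# Made-up example: 500 fragments on a 2,000 bp transcript,
# 10 million mapped fragments in the sample.
print(fpkm(500, 2000, 10_000_000))
# With no fragments assigned, the value is zero by definition.
print(fpkm(0, 2000, 10_000_000))
```

Because the total number of mapped fragments is in the denominator, a low mapping rate also affects how FPKM values compare between samples.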

Thanks, Jen, Galaxy team

