I am using an RNA-seq pipeline for transcriptome analysis. I have the genome annotation (GFF) file to use as a reference. My goal is to measure how gene expression changes between different samples.
For this I am using the following workflow: TopHat, Cufflinks, Cuffmerge and Cuffdiff.
1. When I ran TopHat, the alignment summary showed that only 25% of the reads mapped. That is not a good sign, although I know the mapping rate depends on several factors. My question is: the 75% of reads that are unmapped may contain important information, and I cannot simply ignore them. Does anyone have an idea of how to proceed?
2. After TopHat, will Cufflinks use only the 25% of reads that mapped as input, or will it use all of the reads?
3. Can FPKM give me a better idea of expression? For some of the samples it is giving me a value of zero.
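
For context on question 3, this is how I understand FPKM (fragments per kilobase of transcript per million mapped fragments). The sketch below uses the simplified textbook formula, not Cufflinks' actual statistical estimation, and the function name and all numbers are made up for illustration. It shows that a gene with no fragments assigned to it in a sample comes out as exactly zero, which may be what I am seeing.

```python
def fpkm(fragments, gene_length_bp, total_mapped_fragments):
    """Simplified FPKM: fragments per kilobase of transcript per million mapped fragments.

    Hypothetical helper for illustration only; Cufflinks estimates abundances
    with a statistical model rather than this direct ratio.
    """
    if gene_length_bp <= 0 or total_mapped_fragments <= 0:
        raise ValueError("gene length and total mapped fragments must be positive")
    # 1e9 = 1e3 (bases per kilobase) * 1e6 (fragments per million)
    return fragments * 1e9 / (gene_length_bp * total_mapped_fragments)


# Made-up numbers: a 2,000 bp gene in a sample with 10 million mapped fragments.
expressed = fpkm(fragments=500, gene_length_bp=2000, total_mapped_fragments=10_000_000)
silent = fpkm(fragments=0, gene_length_bp=2000, total_mapped_fragments=10_000_000)

print(expressed)  # 25.0
print(silent)     # 0.0
```

Note that the denominator counts only mapped fragments, so with this formula the unmapped 75% would not enter the calculation at all.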