Hi,
What exactly is the difference between the gene list in the transcript differential expression testing file and the one in the gene differential expression testing file generated by Cuffdiff?
I ask because I found that many genes were not tested in the transcript differential expression testing file, yet the same genes were reported as significant in the gene differential expression testing file, so I am a bit confused about which one is more appropriate to use.
Thanks a lot for answering me!
Kind regards, Ken
Hi Ken,
First, a bit of nomenclature related to Cuffdiff. Transcripts (isoforms) are denoted with a 'TCONS_' prefix. Genes are denoted with an 'XLOC_' prefix. A gene comprises the collective transcripts/isoforms from a given locus. If you look at your 'transcript differential expression testing' file, you'll see that many 'TCONS_' identifiers (column 1) share the same 'XLOC_' identifier (column 2).
Cuffdiff will quantify an FPKM value for each transcript you provide in your reference file (GTF). The FPKM values in the 'transcript differential expression testing' file represent the FPKM of each transcript you provided in the reference (in each condition). The FPKM values in the 'gene differential expression testing' file represent the summed FPKM values of the transcripts (isoforms) contributing to each gene (in each condition).
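To make the relationship between the two files concrete, here is a minimal Python sketch that sums transcript FPKM values per gene. The rows are made-up illustration data, and the column names (test_id, gene_id, value_1, value_2) are assumed to match the header of your transcript-level file, so check them against your own output before reusing this.

```python
import csv
import io
from collections import defaultdict

# Made-up rows imitating a few columns of a transcript-level Cuffdiff file.
# value_1 / value_2 are assumed to hold the FPKM in condition 1 / condition 2.
ISOFORM_ROWS = (
    "test_id\tgene_id\tvalue_1\tvalue_2\n"
    "TCONS_00000001\tXLOC_000001\t10.0\t2.5\n"
    "TCONS_00000002\tXLOC_000001\t5.0\t7.5\n"
    "TCONS_00000003\tXLOC_000002\t0.0\t3.0\n"
)


def gene_fpkm_from_isoforms(handle):
    """Sum isoform FPKM values per gene_id, separately for each condition."""
    totals = defaultdict(lambda: [0.0, 0.0])
    for row in csv.DictReader(handle, delimiter="\t"):
        totals[row["gene_id"]][0] += float(row["value_1"])
        totals[row["gene_id"]][1] += float(row["value_2"])
    return {gene: tuple(values) for gene, values in totals.items()}


gene_totals = gene_fpkm_from_isoforms(io.StringIO(ISOFORM_ROWS))
for gene, (fpkm_1, fpkm_2) in sorted(gene_totals.items()):
    print(gene, fpkm_1, fpkm_2)
```

For a real file, replace the io.StringIO object with an open file handle. The per-gene sums should then be close to the values reported in the gene-level file, though small differences are possible.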
You may see that some transcripts have an FPKM of 0 or a 'NOTEST' status (which Cuffdiff reports when there are not enough alignments for testing), most likely because those isoforms are expressed weakly or not at all. You can check by inspecting the aligned reads (or their absence) over regions unique to each isoform using Trackster.
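If you want to list the genes Ken describes (significant at the gene level, but with no isoform that was actually tested), a rough Python sketch along these lines may help. The data below is invented for illustration, and the column names (test_id, gene_id, status, significant) and the status strings 'OK' and 'NOTEST' are assumptions about the file layout that you should verify against your own files.

```python
import csv
import io
from collections import defaultdict

# Invented gene-level rows: one row per XLOC_ gene.
GENE_ROWS = (
    "test_id\tgene_id\tstatus\tsignificant\n"
    "XLOC_000001\tXLOC_000001\tOK\tyes\n"
    "XLOC_000002\tXLOC_000002\tOK\tyes\n"
    "XLOC_000003\tXLOC_000003\tNOTEST\tno\n"
)

# Invented transcript-level rows: one row per TCONS_ isoform.
ISOFORM_ROWS = (
    "test_id\tgene_id\tstatus\tsignificant\n"
    "TCONS_00000001\tXLOC_000001\tNOTEST\tno\n"
    "TCONS_00000002\tXLOC_000001\tNOTEST\tno\n"
    "TCONS_00000003\tXLOC_000002\tOK\tyes\n"
    "TCONS_00000004\tXLOC_000002\tNOTEST\tno\n"
    "TCONS_00000005\tXLOC_000003\tNOTEST\tno\n"
)


def significant_genes_without_tested_isoforms(gene_handle, isoform_handle):
    """Return significant genes for which no isoform has status 'OK'."""
    isoform_statuses = defaultdict(set)
    for row in csv.DictReader(isoform_handle, delimiter="\t"):
        isoform_statuses[row["gene_id"]].add(row["status"])

    flagged = []
    for row in csv.DictReader(gene_handle, delimiter="\t"):
        gene_is_significant = row["status"] == "OK" and row["significant"] == "yes"
        has_tested_isoform = "OK" in isoform_statuses[row["gene_id"]]
        if gene_is_significant and not has_tested_isoform:
            flagged.append(row["gene_id"])
    return sorted(flagged)


flagged_genes = significant_genes_without_tested_isoforms(
    io.StringIO(GENE_ROWS), io.StringIO(ISOFORM_ROWS)
)
print(flagged_genes)
```

Genes flagged this way are good candidates to inspect in Trackster, since reads pooled across all isoforms of a locus can be sufficient for a gene-level test even when each individual isoform has too few.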
Cheers,
Mo Heydarian