Hello,
I am trying to identify differentially expressed genes across three groups. The groups consist of RNA-seq data from a cell line that was treated with protein A, treated with protein B, or left untreated. Each group contains three samples. The two proteins are known to cause cancer-like mutations, and this should be evident from the differentially expressed genes.

I mapped each sample individually to the hg19 reference genome using TopHat. I then ran cuffdiff with the iGenomes hg19 genes file as the reference annotation and three concatenated files, each containing the mapped BAM files for the three samples of one group, as three distinct conditions (treated A, treated B, untreated). I did not run the Cufflinks assembler or cuffmerge because I am not concerned with novel genes, only with known genes that are differentially expressed.

The cuffdiff output files reporting differentially expressed genes compared only two of the three groups, whereas the output files reporting FPKM values covered all three groups. Is it necessary to run cuffdiff three separate times for this analysis? The results from three separate cuffdiff runs would be much more difficult to analyze. Any suggestions would be greatly appreciated.
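To make the setup concrete, here is a rough sketch of the kind of cuffdiff call described above. It is not the exact command that was run: every file name, the output directory, and the condition labels are hypothetical placeholders. It also assumes that "concatenated" means the replicate BAM files of one condition are given as a comma-separated list, not physically merged into one file. The script only prints the command so it can be checked before running.

```shell
#!/bin/sh
# Hypothetical paths and labels; substitute the real TopHat outputs and GTF.
GTF="genes.gtf"                        # iGenomes hg19 gene annotation (assumed file name)
LABELS="untreated,treatedA,treatedB"   # one label per condition, in the same order as the BAM groups

# Replicates of one condition are joined by commas;
# separate conditions are separated by spaces on the command line.
UNTREATED="untreated_1.bam,untreated_2.bam,untreated_3.bam"
TREATED_A="treatedA_1.bam,treatedA_2.bam,treatedA_3.bam"
TREATED_B="treatedB_1.bam,treatedB_2.bam,treatedB_3.bam"

# -o sets the output directory, -L sets the condition labels.
CMD="cuffdiff -o diff_out -L $LABELS $GTF $UNTREATED $TREATED_A $TREATED_B"

# Print the command instead of executing it.
echo "$CMD"
```

If the BAM files were instead merged into a single file per group, cuffdiff would see one replicate per condition, so it is worth stating which of the two was done.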
Drew