Question: cuffdiff conditions related question
I am using Galaxy to analyze RNA-seq data, and I have a question about the Cuffdiff "condition" option. I have several different species, say A, B and C, and three different experimental conditions for each species, say 1, 2 and 3. When I run the Cuffdiff analysis, can I analyze all the datasets (three species, three conditions each) together? What would be the difference between analyzing each species individually and analyzing all species combined?

Using your case as an example, with three species (A, B, C) and three experimental conditions per species (1, 2, 3): if you ran each experimental condition of species A as a 'condition' in Cuffdiff, Cuffdiff would return all possible pairwise comparisons between the three samples. Your differential expression testing file would have fold change values for A.1 vs A.2, A.1 vs A.3, and A.2 vs A.3.

If you are interested in comparing all nine samples against each other, you would assign each sample as one condition and Cuffdiff will perform all possible comparisons between your nine samples.
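
To make "all possible comparisons" concrete, here is a minimal Python sketch that enumerates the pairwise comparisons for the species A run and counts them for the nine-sample run. This is not Cuffdiff code; the labels (A.1 through C.3) and the helper name `pairwise_comparisons` are made up for illustration.

```python
from itertools import combinations

def pairwise_comparisons(conditions):
    """Return every unordered pair of condition labels (hypothetical helper)."""
    return list(combinations(conditions, 2))

# Hypothetical labels: three conditions for species A only.
species_a = ["A.1", "A.2", "A.3"]

# Hypothetical labels: all nine species/condition combinations.
all_nine = [f"{species}.{cond}" for species in "ABC" for cond in (1, 2, 3)]

print(pairwise_comparisons(species_a))      # the three A-only pairs
print(len(pairwise_comparisons(all_nine)))  # number of pairs among nine conditions
```

With three conditions there are 3 pairs; with nine conditions there are 36, which is why the combined run produces a much larger testing file.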

Just a note here: the 'FPKM tracking' files will have one transcript/gene per line, with the FPKM information for each condition tested represented as a series of columns. The 'differential expression testing' file will have one line per transcript/gene per comparison. For example, if you have 10,000 transcripts in your reference and do the three-way comparison (as above), which yields three pairwise comparisons, the 'transcript FPKM tracking' file will have 10,000 lines and the 'transcript differential expression testing' file will have 30,000 lines.
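
The line counts above follow from simple arithmetic, sketched here in Python. The function name `expected_lines` is hypothetical, and the counts assume one row per transcript per pairwise comparison, ignoring any header line.

```python
from math import comb

def expected_lines(n_transcripts, n_conditions):
    """Estimate data rows in each output file (header lines not counted)."""
    n_comparisons = comb(n_conditions, 2)  # unordered pairs of conditions
    return {
        "fpkm_tracking": n_transcripts,                 # one row per transcript
        "diff_testing": n_transcripts * n_comparisons,  # one row per transcript per pair
    }

print(expected_lines(10_000, 3))  # the three-condition example from this thread
print(expected_lines(10_000, 9))  # all nine samples as separate conditions
```

For the nine-condition case the same reasoning gives 36 comparisons, so roughly 360,000 rows in the testing file for 10,000 transcripts.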

I still have one question. If I ran each experimental condition of species A as a "condition" in Cuffdiff, I would get the comparisons between three samples: (1) A1 vs A2, A1 vs A3, and A2 vs A3. But I could also run all nine samples together and still get the same comparisons: (2) A1 vs A2, A1 vs A3, and A2 vs A3. What I want to know is: what would be the difference between results (1) and (2)?

Hi Siyu, The only difference between your two scenarios will be the number of comparisons you perform. The abundance estimation and differential expression testing for each individual comparison (for example, the comparison between A.1 and A.2) will not change between your two scenarios.
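
If you do run all nine samples together and only want to look at the species A comparisons afterwards, one option is to filter the tab-delimited testing file on its two condition columns. The sketch below assumes those columns are named `sample_1` and `sample_2`, as in Cuffdiff's differential expression output; the helper name `filter_comparisons`, the labels, and the tiny in-memory table are made up for illustration, so check the column names against your own file.

```python
import csv
import io

def filter_comparisons(diff_file, wanted_labels):
    """Keep rows where both compared conditions are in wanted_labels."""
    reader = csv.DictReader(diff_file, delimiter="\t")
    return [
        row for row in reader
        if row["sample_1"] in wanted_labels and row["sample_2"] in wanted_labels
    ]

# Made-up three-row table standing in for a nine-condition testing file.
example = io.StringIO(
    "test_id\tsample_1\tsample_2\n"
    "gene1\tA1\tA2\n"
    "gene1\tA1\tB1\n"
    "gene1\tA2\tA3\n"
)

rows = filter_comparisons(example, {"A1", "A2", "A3"})
print([(r["sample_1"], r["sample_2"]) for r in rows])
```

On a real file you would pass an open file handle instead of the `io.StringIO` object.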

