Hi Nisha,
In short, check out the differential expression outputs from Cuffdiff. These include test statistics for sorting out the transcripts that are significantly associated with one condition versus another. Filter by condition first. Then use the tracking outputs to locate the Cuffmerge coordinates for each transcript and extract the fasta sequence. There are other methods - such as pulling out all the transcripts first, then filtering by condition through identifier matches in the tracking files - but I have found this way to be quicker (fewer steps).
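For anyone doing the filtering step outside of Galaxy, here is a minimal sketch of the idea. It assumes the tab-separated column names described in the Cufflinks manual for the Cuffdiff differential expression files (test_id, sample_1, sample_2, value_1, value_2, significant); please check these against the header of your own file. The function name and the example rows are made up for illustration.

```python
import csv
import io


def split_significant_by_condition(diff_text):
    """Group significant test_ids by the condition with the higher value.

    diff_text is the content of a Cuffdiff differential expression file
    (column names are an assumption; verify against your own header).
    """
    groups = {}
    reader = csv.DictReader(io.StringIO(diff_text), delimiter="\t")
    for row in reader:
        # Keep only the tests that Cuffdiff flagged as significant.
        if row["significant"] != "yes":
            continue
        v1, v2 = float(row["value_1"]), float(row["value_2"])
        higher = row["sample_1"] if v1 > v2 else row["sample_2"]
        groups.setdefault(higher, []).append(row["test_id"])
    return groups


# Made-up example rows, not real data.
HEADER = ["test_id", "gene_id", "gene", "locus", "sample_1", "sample_2",
          "status", "value_1", "value_2", "log2(fold_change)", "test_stat",
          "p_value", "q_value", "significant"]
ROWS = [
    ["TCONS_00000001", "XLOC_000001", "-", "chr1:100-900", "control",
     "treated", "OK", "50.0", "5.0", "-3.32", "-4.1", "0.0001", "0.002", "yes"],
    ["TCONS_00000002", "XLOC_000001", "-", "chr1:100-900", "control",
     "treated", "OK", "10.0", "11.0", "0.14", "0.2", "0.8", "0.9", "no"],
    ["TCONS_00000003", "XLOC_000002", "-", "chr2:50-700", "control",
     "treated", "OK", "0.0", "20.0", "inf", "3.9", "0.0002", "0.003", "yes"],
]
EXAMPLE = "\n".join("\t".join(fields) for fields in [HEADER] + ROWS) + "\n"

if __name__ == "__main__":
    print(split_significant_by_condition(EXAMPLE))
```

Grouping by "which condition has the higher value" is just one way to define "associated with a condition"; a fold-change cutoff or a value of zero in the other condition could be used instead.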
More detail: Each transcript will have some mix of reads from the conditions - specifically for the regions of the transcript that are in common between the conditions. This will be the case for most of the data. The exceptions will be cases where a transcript (or possibly a gene) is only expressed in a particular condition (specifically - one or more expressed transcripts that do not share enough exon sequence to meet the paired-end mapping criteria set in earlier steps). Reads from conditions map to transcripts in a many-to-many relationship.
Cufflinks does not create "consensus" sequences - the output is just where in the genome the transcripts are located (coordinates). If you extract fasta sequence from the genome based on those coordinates, splicing differences will be represented, which is the important part (minor changes in read content that did not impact splicing should not be a factor for differential expression analysis).
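To illustrate why coordinate-based extraction still captures splicing differences, here is a small sketch that builds a transcript sequence by joining exon intervals from the genome. It assumes 1-based, inclusive exon coordinates (as in GTF) and a chromosome sequence already loaded as a string; the function name and the toy sequence are hypothetical.

```python
# Translation table for complementing DNA (upper and lower case).
COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def spliced_sequence(chrom_seq, exons, strand):
    """Join exon intervals into one transcript sequence.

    exons: list of (start, end) pairs, 1-based and inclusive (assumed).
    strand: "+" or "-"; minus-strand transcripts are reverse-complemented.
    """
    pieces = [chrom_seq[start - 1:end] for start, end in sorted(exons)]
    seq = "".join(pieces)
    if strand == "-":
        seq = seq.translate(COMPLEMENT)[::-1]
    return seq


if __name__ == "__main__":
    genome = "AAACCCGGGTTTACGT"  # toy sequence, not real data
    # Two isoforms that share the first exon but use different second exons.
    print(spliced_sequence(genome, [(1, 3), (7, 9)], "+"))
    print(spliced_sequence(genome, [(1, 3), (10, 12)], "+"))
```

The two isoforms produce different sequences from the same genome, even though no read-level consensus is involved.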
Use tools like "Filter" and/or "Select" to separate transcripts by condition in the Cuffdiff output. Then tools in the group "Join, Subtract and Group", such as "Join two datasets", will help when tracing identifiers back through the tracking files between the outputs of the different tools. Once you have the targeted Cufflinks coordinates grouped by condition, do the fasta extraction using the tool "Extract Genomic DNA". Those fasta transcripts can then be run through BLAST+ and mapped to GO terms.
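The join step can also be sketched in a few lines for anyone working from the command line. This example assumes a standard 9-column GTF (such as the merged annotation from Cuffmerge) with a transcript_id attribute, and a set of identifiers already selected for one condition; the function name and the example lines are made up.

```python
import re
from collections import defaultdict

TRANSCRIPT_ID = re.compile(r'transcript_id "([^"]+)"')


def exons_for_transcripts(gtf_text, wanted_ids):
    """Collect (chrom, start, end, strand) exon records for the wanted ids."""
    exons = defaultdict(list)
    for line in gtf_text.splitlines():
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        # Standard GTF has 9 columns; only exon features carry the structure.
        if len(fields) < 9 or fields[2] != "exon":
            continue
        match = TRANSCRIPT_ID.search(fields[8])
        if match and match.group(1) in wanted_ids:
            exons[match.group(1)].append(
                (fields[0], int(fields[3]), int(fields[4]), fields[6])
            )
    return dict(exons)


# Made-up GTF lines, not real data.
EXAMPLE_GTF = "\n".join([
    'chr1\tCufflinks\texon\t100\t200\t.\t+\t.\t'
    'gene_id "XLOC_000001"; transcript_id "TCONS_00000001"; exon_number "1";',
    'chr1\tCufflinks\texon\t300\t400\t.\t+\t.\t'
    'gene_id "XLOC_000001"; transcript_id "TCONS_00000001"; exon_number "2";',
    'chr1\tCufflinks\texon\t100\t250\t.\t+\t.\t'
    'gene_id "XLOC_000001"; transcript_id "TCONS_00000002"; exon_number "1";',
])

if __name__ == "__main__":
    print(exons_for_transcripts(EXAMPLE_GTF, {"TCONS_00000001"}))
```

The resulting exon records per transcript are what a tool like "Extract Genomic DNA" would need in order to pull the matching sequence.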
For more about the output files from any of the tools in the Tuxedo suite, the best resource is here: http://cole-trapnell-lab.github.io/cufflinks/manual/
Apologies for the delayed reply - nearly all of our team was traveling last week and a few questions slipped through. But hopefully this helps to get your analysis going! Best, Jen, Galaxy team