Hello,
For the question about reference annotation: correct formatting matters, but the more likely cause is a mismatch between the reference genome and the reference annotation. This usually happens when the two datasets are based on different versions of a genome build, so the chromosome identifiers do not match (for example, "chr1" in one and "1" in the other). Help for checking whether this is the case is in the Galaxy wiki here:
http://wiki.galaxyproject.org/Support -> section 2.11
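If you want a quick check outside of Galaxy, the comparison can be sketched in a few lines of Python. This is only an illustration, not part of Galaxy or Cufflinks: the function names are made up for this example, and it assumes a plain-text FASTA genome and a tab-delimited GTF/GFF3 file with the chromosome name in the first column.

```python
def fasta_names(fasta_lines):
    """Collect sequence identifiers from FASTA header lines.

    The identifier is the text after '>' up to the first whitespace.
    """
    names = set()
    for line in fasta_lines:
        if line.startswith(">"):
            parts = line[1:].split()
            if parts:
                names.add(parts[0])
    return names


def annotation_names(annotation_lines):
    """Collect the chromosome identifiers from column 1 of a GTF/GFF3 file."""
    names = set()
    for line in annotation_lines:
        # Skip blank lines and comment/directive lines.
        if not line.strip() or line.startswith("#"):
            continue
        names.add(line.split("\t", 1)[0])
    return names


def compare_names(fasta_lines, annotation_lines):
    """Return (identifiers only in the annotation, identifiers only in the genome)."""
    genome = fasta_names(fasta_lines)
    annotation = annotation_names(annotation_lines)
    return sorted(annotation - genome), sorted(genome - annotation)
```

Both arguments can be open file handles, for example `compare_names(open("genome.fa"), open("genes.gtf"))` (file names are placeholders). If the first list is not empty, those annotation lines have no matching sequence in the genome and will not be used as intended.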
If that is not the problem, then the cause may be the content of the "attributes" field (the ninth column) of the GFF3/GTF dataset. The Cufflinks tool package reports full statistics only when the annotation provides certain attributes. Illumina's iGenomes collection has one of the best sets of annotation files to use (these contain a complete attribute set).
To review the content used, please see the Tuxedo pipeline documentation at:
http://cole-trapnell-lab.github.io/cufflinks/manual/
In particular, review the inputs for Cuffdiff here:
http://cole-trapnell-lab.github.io/cufflinks/cuffdiff/index.html#cuffdiff-input-files
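To see which attributes your annotation actually carries, something like the sketch below can help. It is a rough illustration only: the function name is invented here, it handles GTF-style attributes (key "value";) and not GFF3 key=value pairs, and the default list of attribute names (gene_id, transcript_id, tss_id, p_id) is my reading of the Cuffdiff input documentation linked above, so please confirm it there.

```python
import re
from collections import Counter

# Matches GTF-style attribute pairs such as: gene_id "XLOC_000001";
ATTR_RE = re.compile(r'(\S+)\s+"([^"]*)"')


def count_attributes(gtf_lines,
                     keys=("gene_id", "transcript_id", "tss_id", "p_id")):
    """Count how many feature lines carry each attribute in `keys`.

    Returns (number of feature lines, {attribute: lines containing it}).
    """
    counts = Counter()
    total = 0
    for line in gtf_lines:
        # Skip blank lines and comments.
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9:
            continue  # not a complete 9-column feature line
        total += 1
        present = {key for key, _ in ATTR_RE.findall(fields[8])}
        for key in keys:
            if key in present:
                counts[key] += 1
    return total, {key: counts[key] for key in keys}
```

An attribute with a count of zero (or much lower than the total) is a good hint about which part of the output will be empty or incomplete.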
iGenomes annotation data can be found at the link below. Download and unpack the archive, then upload only the "genes.gtf" file to Galaxy for use:
http://support.illumina.com/sequencing/sequencing_software/igenome.html
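The archives are large, so it can be convenient to pull out only "genes.gtf" rather than unpacking everything. A minimal sketch, assuming the download is a tar archive (compressed or not) and that the annotation is stored as a regular file named "genes.gtf" somewhere inside it; the function name and the output path are placeholders:

```python
import shutil
import tarfile


def extract_genes_gtf(archive_path, out_path="genes.gtf"):
    """Copy the first regular file named 'genes.gtf' out of a tar archive.

    Returns the member's path inside the archive, or None if none is found.
    """
    # "r:*" lets tarfile detect the compression (gzip, bzip2, none) itself.
    with tarfile.open(archive_path, "r:*") as tar:
        for member in tar:
            # isfile() skips directories and symbolic links.
            if member.isfile() and member.name.split("/")[-1] == "genes.gtf":
                with tar.extractfile(member) as src, open(out_path, "wb") as dst:
                    shutil.copyfileobj(src, dst)
                return member.name
    return None
```

The resulting "genes.gtf" is the file to upload to Galaxy.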
For your second question about the translated report, I personally do not know the answer. The issue may go away with the proper annotation dataset, so try that first. If there is still a problem, this is a good question for the tool's support group (contact info is at the same web site as the manual, linked above).
Take care, Jen, Galaxy team