19 months ago by
United States
Hello,
The mm10 reference genome native to Galaxy is the correct one to use for mapping (TopHat or HISAT). Use that same reference genome with any downstream tool that requires a genome to be specified.
To obtain annotated genes, upload a reference annotation dataset and give it to Cuffmerge along with the Cufflinks GTFs. Cuffmerge produces a complete GTF of all transcripts (novel and known). Then use that Cuffmerge GTF as the combined reference annotation input dataset for Cuffdiff.
Alternatively, you can skip running Cufflinks and Cuffmerge and instead use the reference annotation (GTF) directly from the source (iGenomes is best) with Cuffdiff.
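In Galaxy both routes are set up through the tool forms, but it may help to see them side by side as the equivalent command lines. The Python sketch below only builds the argument lists and does not run anything. All file names (genes.gtf, assemblies.txt, the BAM files), the output directory names, and the helper functions are hypothetical placeholders, so substitute your own datasets.

```python
from typing import List


def cuffmerge_cmd(reference_gtf: str, assembly_list: str,
                  out_dir: str = "merged_asm") -> List[str]:
    """Build a cuffmerge call.

    -g supplies the reference annotation; the positional argument is a
    text file listing one Cufflinks GTF path per line.
    """
    return ["cuffmerge", "-g", reference_gtf, "-o", out_dir, assembly_list]


def cuffdiff_cmd(annotation_gtf: str, condition_bams: List[List[str]],
                 labels: List[str], out_dir: str = "diff_out") -> List[str]:
    """Build a cuffdiff call.

    Replicates within one condition are comma-separated; conditions are
    passed as separate positional arguments after the annotation GTF.
    """
    groups = [",".join(bams) for bams in condition_bams]
    return ["cuffdiff", "-o", out_dir, "-L", ",".join(labels),
            annotation_gtf, *groups]


def plan(discover_novel: bool, reference_gtf: str, assembly_list: str,
         condition_bams: List[List[str]], labels: List[str]) -> List[List[str]]:
    """Return the commands for one of the two routes described above."""
    if discover_novel:
        # Route 1: merge Cufflinks assemblies with the reference, then
        # test against the merged GTF (cuffmerge writes merged.gtf into
        # its output directory).
        merged_dir = "merged_asm"
        return [
            cuffmerge_cmd(reference_gtf, assembly_list, merged_dir),
            cuffdiff_cmd(f"{merged_dir}/merged.gtf", condition_bams, labels),
        ]
    # Route 2: known transcripts only, reference GTF goes straight to Cuffdiff.
    return [cuffdiff_cmd(reference_gtf, condition_bams, labels)]


if __name__ == "__main__":
    bams = [["wt_1.bam", "wt_2.bam"], ["ko_1.bam", "ko_2.bam"]]
    for cmd in plan(True, "genes.gtf", "assemblies.txt", bams, ["WT", "KO"]):
        print(" ".join(cmd))
```

The only difference between the two routes is which GTF Cuffdiff receives: the merged GTF from Cuffmerge, or the reference annotation as downloaded.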
There are further choices about when to supply a reference annotation dataset: during the mapping step, during transcript assembly, or both. Each workflow option produces slightly different results, so choose based on your goal: discovery of novel transcripts plus differential expression, or differential expression of known transcripts only.
Please see this prior Q&A for where to get the best version of a reference annotation dataset for mm10: https://biostar.usegalaxy.org/p/21827/#21845
For more on how the complete process is run, including a description of the alternatives, the manual and tutorials here have example usage:
We hope this helps! Jen, Galaxy team