Question: Cufflinks and gene name
3.1 years ago, ctoscano (United States) wrote:

Hello everyone,

I know there are many posts about this question, but I still have not been able to solve the problem. After processing with TopHat (reference genome mm10) and Cufflinks with standard options, I tried to use Cuffmerge and Cuffdiff to get FPKM values. I've used many different GTF files as the reference in Cuffmerge, but I'm having a really hard time getting the gene names into the results. What should I do?


galaxy cufflinks • 1.2k views
3.1 years ago, Jennifer Hillman Jackson (United States) wrote:


The most direct way is to use a GTF reference annotation file that contains the gene_name attribute in the 9th field. There is one for mm10 from iGenomes. Download the tar file locally, unpack it, and then upload the genes.gtf dataset to Galaxy for use with these tools.
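
Before uploading, it can help to confirm that the GTF really carries gene_name. Below is a minimal Python sketch of one way to check. The function names and the two sample records are made up for illustration, and the parser only picks up quoted attribute values.

```python
import re

def parse_attributes(gtf_line):
    """Return the quoted key/value pairs found in the 9th GTF field as a dict."""
    fields = gtf_line.rstrip("\n").split("\t")
    if len(fields) < 9:
        return {}
    # Attributes look like: gene_id "X"; gene_name "Y"; (unquoted values are skipped)
    return dict(re.findall(r'(\S+) "([^"]*)"', fields[8]))

def gene_name_coverage(gtf_lines):
    """Count how many records carry a gene_name attribute, out of all records."""
    with_name = 0
    total = 0
    for line in gtf_lines:
        if line.startswith("#") or not line.strip():
            continue  # skip comment and blank lines
        total += 1
        if "gene_name" in parse_attributes(line):
            with_name += 1
    return with_name, total

# Hypothetical records, for illustration only (not taken from a real annotation).
sample = [
    "#made-up example",
    'chr1\tsrc\texon\t100\t200\t.\t+\t.\tgene_id "GeneA"; gene_name "GeneA"; transcript_id "TX_0001";',
    'chr1\tsrc\texon\t300\t400\t.\t+\t.\tgene_id "XLOC_000001"; transcript_id "TCONS_00000001";',
]
print(gene_name_coverage(sample))
```

For a real file, pass an open file handle (for example the unpacked genes.gtf) instead of the sample list. If the first number is zero, that GTF will not give you gene names downstream.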

Another method is to map the identifiers contained in the files (supplied by the reference annotation used) to a gene identifier (name or symbol), then link that in. A common way to do this is to use a two-column file of the mappings along with the "Join, Subtract and Group > Join" tool. UCSC and BioMart will have mm10 annotation mapping transcript identifiers to gene identifiers.
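
Outside Galaxy, the same idea can be sketched in a few lines of Python. This is only an illustration of the join logic, not what the Join tool does internally. The function names, the identifiers, and the "-" placeholder for unmatched rows are all assumptions.

```python
def load_mapping(lines):
    """Build {transcript_id: gene_name} from two-column, tab-separated lines."""
    mapping = {}
    for line in lines:
        parts = line.rstrip("\n").split("\t")
        if len(parts) >= 2 and parts[0]:
            mapping[parts[0]] = parts[1]
    return mapping

def add_gene_names(rows, mapping, id_col=0, missing="-"):
    """Append the mapped gene name to each tab-separated row.

    Rows whose identifier is not in the mapping are kept and get `missing`.
    """
    joined = []
    for row in rows:
        cols = row.rstrip("\n").split("\t")
        joined.append(cols + [mapping.get(cols[id_col], missing)])
    return joined

# Hypothetical identifiers and values, for illustration only.
mapping_lines = ["TX_0001\tGeneA", "TX_0002\tGeneB"]
result_rows = ["TX_0001\t12.5", "TX_0003\t0.8"]

for cols in add_gene_names(result_rows, load_mapping(mapping_lines)):
    print("\t".join(cols))
```

Keeping the unmatched rows with a placeholder makes it easy to see how many identifiers failed to map, which usually points to a mismatch between the annotation source and the mapping file.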

A final method that could be used is examining the overlapping coordinates of transcripts in the result files with a tabular file that contains a gene identifier along with coordinates. This is a less precise method, but is a way to capture some annotation if there are no other options. There are many tools that compare the coordinates of datasets - query by "interval" in the tool search to find the most commonly used. 
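
To show why the coordinate approach is less precise, here is a small Python sketch of overlap-based annotation. It assumes half-open intervals (start included, end excluded) and uses made-up names and coordinates. It is not how any particular Galaxy interval tool is implemented.

```python
def overlaps(a_start, a_end, b_start, b_end):
    """True if two half-open intervals [start, end) share at least one base."""
    return a_start < b_end and b_start < a_end

def annotate_by_overlap(transcripts, genes):
    """Map each transcript id to the names of all genes it overlaps.

    transcripts: iterable of (transcript_id, chrom, start, end)
    genes:       iterable of (chrom, start, end, gene_name)
    """
    genes_by_chrom = {}
    for chrom, start, end, name in genes:
        genes_by_chrom.setdefault(chrom, []).append((start, end, name))

    hits = {}
    for tid, chrom, start, end in transcripts:
        hits[tid] = [
            name
            for g_start, g_end, name in genes_by_chrom.get(chrom, [])
            if overlaps(start, end, g_start, g_end)
        ]
    return hits

# Hypothetical coordinates and names, for illustration only.
genes = [("chr1", 100, 500, "GeneA"), ("chr1", 450, 900, "GeneB")]
transcripts = [
    ("TX_0001", "chr1", 120, 300),    # inside GeneA only
    ("TX_0002", "chr1", 460, 480),    # falls where GeneA and GeneB overlap
    ("TX_0003", "chr2", 120, 300),    # different chromosome, no gene
]
print(annotate_by_overlap(transcripts, genes))
```

A transcript that lands in a region shared by two genes gets both names, and strand is ignored here, so results from this kind of comparison need a manual look before they are trusted.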

There are other detailed replies here in Biostar, but it sounds like you have already reviewed those so I won't link them in.

Hopefully this helps! Jen, Galaxy team


Thank you Jen, I will try first with the GTF from iGenomes.

Reply written 3.1 years ago by ctoscano

I downloaded mm10 from iGenomes and used the UCSC version. I don't know which GTF file to use; there are 4 folders with different dates. Should I use the latest?


Reply written 3.1 years ago by ctoscano

The latest would have the most current data. Give that a try if you have not already. Jen

Reply written 3.1 years ago by Jennifer Hillman Jackson