Question: Human annotation GFF suitable for Galaxy Cufflinks (hg38)?
vebaev (Bulgaria) wrote, 2.5 years ago:

Hi,

I want to run Tophat (on hg38) and then Cufflinks. Cufflinks asks me for a human annotation (probably in GFF). Does anybody know where I can get a file suitable for Galaxy? I came across some GFFs, but they use different abbreviations for the chromosomes, and I do not know which one Galaxy uses.

Jennifer Hillman Jackson (United States) wrote, 2.5 years ago:

Hello,

iGenomes has the best reference annotation for this tool, but hg38 is not available yet. I personally do not know the ETA for hg38, but you could ask them.
http://support.illumina.com/sequencing/sequencing_software/igenome.html

If you really want to use hg38, some things to check for in the reference annotation for optimal use of the available functions in Cufflinks (and more importantly, for Cuffdiff):

  1. chromosome identifiers are an exact match between all inputs (ref genome, ref annotation, any other external resources)
  2. gene_id and transcript_id have distinct values in the attributes field
  3. p_id and tss_id exist in the attributes field
  4. optionally, gene_name exists in the attributes field
  5. no duplicated "ID" values between distinct entries in the attributes field
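
Several of the checks above can be automated before committing to a long run. The following is a minimal sketch in Python, not part of Cufflinks or Galaxy: the function names `parse_attributes` and `check_gtf` are hypothetical, the input is assumed to be a 9-column tab-separated GTF, and the set of chromosome names is assumed to come from the reference genome you mapped against. It counts over all feature types, so records that legitimately lack an attribute (for example gene-level lines without `transcript_id`) will also be counted. It does not cover item 5.

```python
import re
from collections import defaultdict

# Matches GTF-style attributes such as: gene_id "ABC";
ATTR_RE = re.compile(r'(\w+) "([^"]*)"')

# Attributes named in the checklist above.
CHECKED_KEYS = ("gene_id", "transcript_id", "p_id", "tss_id", "gene_name")


def parse_attributes(field):
    """Return a dict of key -> value from a GTF attributes column."""
    return dict(ATTR_RE.findall(field))


def check_gtf(gtf_lines, genome_chroms):
    """Summarise how an annotation fares against items 1-4 of the checklist.

    gtf_lines     -- any iterable of text lines (for example an open file)
    genome_chroms -- set of chromosome names used by the reference genome
    """
    report = {
        "records": 0,
        "unknown_chroms": set(),           # item 1: names absent from the genome
        "same_gene_and_transcript_id": 0,  # item 2: ids that are not distinct
        "missing": defaultdict(int),       # items 3-4: absent attributes
    }
    for line in gtf_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and header comments
        cols = line.rstrip("\n").split("\t")
        if len(cols) != 9:
            continue  # not a well-formed GTF record
        report["records"] += 1
        if cols[0] not in genome_chroms:
            report["unknown_chroms"].add(cols[0])
        attrs = parse_attributes(cols[8])
        for key in CHECKED_KEYS:
            if key not in attrs:
                report["missing"][key] += 1
        if "gene_id" in attrs and attrs["gene_id"] == attrs.get("transcript_id"):
            report["same_gene_and_transcript_id"] += 1
    return report
```

To use it, pass an open annotation file and the chromosome names from your genome, for example `check_gtf(open("annotation.gtf"), {"chr1", "chr2"})` (the file name and the names in the set are placeholders). A non-empty `unknown_chroms` set is the usual sign of the "chr1" versus "1" naming mismatch raised in the question.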

If you locate a good file, please consider sharing it on http://usegalaxy.org. You can also write back here and we can consider including it in a shared Data Library for everyone to access directly.

Best, Jen, Galaxy team


I got one (named "Basic gene annotation by CHR") from http://www.gencodegenes.org/releases/current.html

Can you tell me whether it is suitable? I have already run my Tophat mapping, so if it is not, I should pause and go back to searching for another one.

Oops, sorry for cross-posting: I also posted this as a new reply rather than as a comment to you.

written 2.5 years ago by vebaev

Check the items I listed out. Most of these come from the Cuffdiff documentation available here: http://cole-trapnell-lab.github.io/cufflinks/cuffdiff/index.html#cuffdiff-input-files

written 2.5 years ago by Jennifer Hillman Jackson

Thanks, I will go back to hg19!

(Is there a link to just the hg19 GTF from iGenomes? I started downloading, but it is a whole package, Homo_sapiens_UCSC_hg19.tar.gz, and it will take too long to wait for it only to get the GTF file.)

written 2.5 years ago by vebaev

The dataset is also available on http://usegalaxy.org under Shared Data -> Data Libraries -> iGenomes. The full data that you are downloading will contain other ancillary files (which may be useful!). The file "genes.gtf" is the reference annotation to use with the tools directly.
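
If the full archive has already been downloaded, the annotation can be pulled out without unpacking everything else. This is a minimal sketch using Python's standard `tarfile` module; the helper name `extract_named_file` is hypothetical, and the location of "genes.gtf" inside the archive is not assumed, so the code simply takes the first regular file with that name. It does not shorten the download itself; for that, the Data Library route above is the quicker option.

```python
import shutil
import tarfile


def extract_named_file(archive_path, basename, dest_path):
    """Copy the first regular file called `basename` out of a .tar.gz archive.

    Returns the member's path inside the archive, or None if no regular
    file with that name is found. Symbolic links are skipped.
    """
    with tarfile.open(archive_path, "r:gz") as tar:
        for member in tar:
            if member.isfile() and member.name.split("/")[-1] == basename:
                source = tar.extractfile(member)
                with open(dest_path, "wb") as out:
                    shutil.copyfileobj(source, out)
                return member.name
    return None
```

For example, `extract_named_file("Homo_sapiens_UCSC_hg19.tar.gz", "genes.gtf", "genes.gtf")` would write the annotation to the current directory and return where it was found in the archive.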

modified 2.5 years ago • written 2.5 years ago by Jennifer Hillman Jackson

Update: I took a look at the file. Looks like it will work, but just be aware that the full complement of statistics from Cuffdiff (if you plan on using that tool) will not be generated without the additional attributes. Thanks! Jen

written 2.5 years ago by Jennifer Hillman Jackson