Question: iGenomes UCSC hg38 annotation gtf file
14 months ago by
mk146090 wrote:

Dear community,

I have been analysing RNA-seq data using Tuxedo on Galaxy. I downloaded the iGenomes UCSC hg38 reference annotation .tar.gz file (14.9 GB), extracted the folder onto my computer, and followed the path Homo_sapiens_UCSC_hg38\Homo_sapiens\UCSC\hg38\Annotation\Archives\archive-2015-08-14-08-18-15. There are two folders there (Genes and Genes.gencode), each with a genes.gtf file (148 MB in the Genes folder and 1.333 GB in the Genes.gencode folder), and I am uncertain which one to use. Using either as the annotation file in the RNA-seq analysis gives slightly different DEGs, and I don't know which better represents my results. If anyone could help, it would be very much appreciated!

14 months ago by
Jennifer Hillman Jackson wrote:


"Genes" is associated with RefSeq and "Genes.gencode" is associated with Gencode. Both are valid reference annotation sources but will have differences between them.

One is not necessarily better than the other and which to use is your choice. Perhaps reviewing the methods for how each is created will help in making the decision?
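As a rough way to see how the two files differ before choosing, the sketch below counts feature types and distinct gene and transcript IDs in a GTF file, then compares the gene ID sets of two summaries. The function names (`summarize_gtf`, `summarize_gtf_file`, `compare_gene_sets`) are made up for this illustration and are not part of Galaxy or iGenomes. It assumes standard 9-column, tab-separated GTF with `gene_id "..."` and `transcript_id "..."` attributes. Note that the two files may use different naming schemes for `gene_id`, so a low overlap can reflect naming rather than content.

```python
import re
from collections import Counter

# Attribute patterns for standard GTF column 9, e.g. gene_id "ABC";
GENE_ID = re.compile(r'gene_id "([^"]+)"')
TRANSCRIPT_ID = re.compile(r'transcript_id "([^"]+)"')


def summarize_gtf(lines):
    """Summarize an iterable of GTF lines.

    Returns a dict with per-feature-type counts and the sets of
    distinct gene and transcript IDs that were seen.
    """
    features = Counter()
    genes, transcripts = set(), set()
    for line in lines:
        # Skip blank lines and header/comment lines.
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9:
            continue  # not a well-formed GTF record
        features[fields[2]] += 1  # column 3 is the feature type
        gene = GENE_ID.search(fields[8])
        if gene:
            genes.add(gene.group(1))
        transcript = TRANSCRIPT_ID.search(fields[8])
        if transcript:
            transcripts.add(transcript.group(1))
    return {"features": dict(features), "genes": genes, "transcripts": transcripts}


def summarize_gtf_file(path):
    """Convenience wrapper: summarize a GTF file on disk."""
    with open(path, encoding="utf-8") as handle:
        return summarize_gtf(handle)


def compare_gene_sets(summary_a, summary_b):
    """Count gene IDs shared between two summaries and unique to each."""
    a, b = summary_a["genes"], summary_b["genes"]
    return {"shared": len(a & b), "only_a": len(a - b), "only_b": len(b - a)}


# Example usage (paths are placeholders; point them at your extracted files):
# refseq = summarize_gtf_file("Genes/genes.gtf")
# gencode = summarize_gtf_file("Genes.gencode/genes.gtf")
# print(refseq["features"], len(refseq["genes"]), len(refseq["transcripts"]))
# print(gencode["features"], len(gencode["genes"]), len(gencode["transcripts"]))
# print(compare_gene_sets(refseq, gencode))
```

A much larger transcript count in one file would be consistent with the size difference between the two genes.gtf files and may help explain why the DEG lists differ slightly.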

Thanks, Jen, Galaxy team

