Question: Ensembl GTF With Correct Seqname
3.6 years ago by
madkisson30
United States
madkisson30 wrote:

Hi

Is there a version of the Ensembl GTF that has both a complete attributes column [with actual gene_id, gene_name, etc. rather than just the transcript ID repeated] AND the proper nomenclature in the seqname column [i.e., chr1 rather than just 1]?

rna-seq tophat galaxy • 723 views
3.6 years ago by
United States
Jennifer Hillman Jackson25k wrote:

Hello,

Not that I am aware of. But you could create such a file using iGenomes content as a base. The key will be to map the Ensembl identifiers to the UCSC identifiers and then swap those into the GTF file. Just adding "chr" to the start of the identifiers works well for some chromosomes, but not all (for example, the mitochondrial sequence is MT in Ensembl but chrM at UCSC).

I came across a git repo last week where the chromosome identifiers from multiple genome sources are mapped to each other. It looks really useful to me, but of course use it with caution and sanity-check the results. It is brand new. I starred it and have been following the progress and updates. These are tabular files, so they can be loaded and used within Galaxy easily. If you have questions about the content, the repository owner would be the best contact.
http://github.com/dpryan79/ChromosomeMappings

Tools in the group "Text manipulation" along with the tool "Join two Datasets side by side on a specified field" will do the transformation (although you could also do this on the command line, then upload the result to Galaxy).
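
For the command-line route, here is a minimal Python sketch of the same swap. It assumes the mapping file is two tab-separated columns (source name, then target name) and that contigs with no equivalent have an empty second column; check this against the file you actually download. The function names and the file names in the usage note are made up for illustration.

```python
import csv
import io


def load_mapping(handle):
    """Read a two-column, tab-separated mapping: source name -> target name."""
    mapping = {}
    for row in csv.reader(handle, delimiter="\t"):
        # Skip blank lines and rows without a target name (unmapped contigs).
        if len(row) >= 2 and row[0] and row[1]:
            mapping[row[0]] = row[1]
    return mapping


def rename_seqnames(gtf_in, gtf_out, mapping):
    """Rewrite column 1 of each GTF record using the mapping.

    Comment lines (starting with '#') are copied unchanged. Records whose
    seqname has no mapping are dropped; the number dropped is returned so
    the result can be sanity-checked.
    """
    dropped = 0
    for line in gtf_in:
        if line.startswith("#"):
            gtf_out.write(line)
            continue
        seqname, sep, rest = line.partition("\t")
        target = mapping.get(seqname)
        if not sep or target is None:
            dropped += 1
            continue
        gtf_out.write(target + sep + rest)
    return dropped


if __name__ == "__main__":
    # Tiny in-memory demo; real use would open files on disk instead.
    demo_map = load_mapping(io.StringIO("1\tchr1\nMT\tchrM\n"))
    demo_gtf = io.StringIO('MT\tsrc\texon\t1\t10\t.\t+\t.\tgene_id "g1";\n')
    out = io.StringIO()
    rename_seqnames(demo_gtf, out, demo_map)
    print(out.getvalue(), end="")
```

With real files, you would open the mapping and the GTF with `open(...)` (for example a hypothetical `ensembl2UCSC.txt` and `genes.gtf`), pass the handles to the two functions, and compare the dropped count against what you expect before uploading the result to Galaxy.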

Hopefully this helps! Jen, Galaxy team
