Question: Add Missing Attributes to GTF
2.2 years ago, rb wrote:

I'm working with the ferret genome (MusPutFur1.0.69) and I'm using the Ensembl GTF file. However, some of the genes do not have annotated gene names. Every gene has a gene_id, but some are missing the gene_name attribute.

I'm using a pipeline that requires a gene_name for every gene in my GTF file, so to solve this I want to use the gene_id as the gene_name. Is there a simple way of parsing the GTF and adding the ID as the name for all rows missing a name?

Thank you.

Tags: rna-seq, gtf, ferret
2.2 years ago, Jennifer Hillman Jackson (United States) wrote:

Hello,

There isn't a specific tool in Galaxy to do this in one step.

You could alter the data on the command line. Alternatively, some general command-line tools are also available in Galaxy. Search by tool name: Select (regular expression search), Awk, Sed, and others like Cut, Paste, Concatenate (cat). These are good when you already know how to use them or are willing to read online help about usage/syntax (the tool forms have some, but many sites offer Q&A about the tools with examples).

I think the simplest implementation would be to separate the file into two datasets: one where the rows already have a gene_name and one where they do not. Then manipulate the one without. When finished, merge the two parts together again.

Best, Jen, Galaxy team
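For readers working outside Galaxy, the same idea can be sketched in a few lines of Python without splitting the file. This is only a minimal sketch, not a Galaxy tool: the function names `add_missing_gene_name` and `fix_gtf` are made up for illustration, and it assumes a standard 9-column, tab-separated GTF with quoted attribute values such as `gene_id "..."`. Rows that already carry a gene_name, comment lines, and malformed rows are passed through unchanged.

```python
import re

# Assumes quoted attribute values, e.g. gene_id "GENE0001";
GENE_ID_RE = re.compile(r'\bgene_id "([^"]+)"')
GENE_NAME_RE = re.compile(r'\bgene_name "')


def add_missing_gene_name(line):
    """Return the GTF line with gene_name set to gene_id when gene_name is absent."""
    # Pass comment/header lines and blank lines through untouched.
    if line.startswith("#") or not line.strip():
        return line
    stripped = line.rstrip("\r\n")
    newline = line[len(stripped):]
    fields = stripped.split("\t")
    # A GTF data row has exactly 9 tab-separated columns; leave anything else alone.
    if len(fields) != 9:
        return line
    attrs = fields[8]
    if GENE_NAME_RE.search(attrs):
        return line  # already has a gene_name
    match = GENE_ID_RE.search(attrs)
    if match is None:
        return line  # no gene_id to copy from
    attrs = attrs.rstrip()
    if not attrs.endswith(";"):
        attrs += ";"
    # Append gene_name at the end of the attributes column.
    fields[8] = f'{attrs} gene_name "{match.group(1)}";'
    return "\t".join(fields) + newline


def fix_gtf(in_path, out_path):
    """Copy in_path to out_path, filling in missing gene_name attributes."""
    with open(in_path) as src, open(out_path, "w") as dst:
        for line in src:
            dst.write(add_missing_gene_name(line))
```

Calling `fix_gtf("input.gtf", "output.gtf")` (both file names are placeholders) would write a new file and leave the original untouched, so the result can be compared against the input before it goes into the pipeline.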

2.2 years ago, rb wrote:

Thank you for the advice! Does the order of the attributes matter? If I append the gene name at the end for the rows that are missing it, while the rows that already had a gene name have it in the middle, is this usually a problem, or is it software-specific?


Technically, the order of values shouldn't matter in a GTF attributes field, but not all software behaves as expected. I suggest testing it out to see.

Reply written 2.1 years ago by Jennifer Hillman Jackson
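To illustrate why attribute order should not matter to a parser that looks attributes up by key, here is a small Python sketch. The helper `parse_attributes` and the example values are hypothetical, and the regular expression only handles quoted values; it also keeps just the last value when a key repeats. It says nothing about how any particular downstream tool behaves, so testing with the actual pipeline is still the real check.

```python
import re

# Matches key "value" pairs; unquoted values are not handled in this sketch.
ATTR_RE = re.compile(r'(\S+) "([^"]*)";?')


def parse_attributes(attr_field):
    """Parse a GTF attributes column into a dict keyed by attribute name."""
    return dict(ATTR_RE.findall(attr_field))


# gene_name in the middle versus appended at the end (made-up values).
middle = 'gene_id "GENE0001"; gene_name "abc"; gene_biotype "protein_coding";'
end = 'gene_id "GENE0001"; gene_biotype "protein_coding"; gene_name "abc";'

# A key-based lookup sees the same attributes either way.
same = parse_attributes(middle) == parse_attributes(end)
```

Software that instead relies on attribute position (for example, expecting gene_name to be the second or third entry) is where a reordered file could cause trouble.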