Question: Galaxy Unable To Set Metadata For Gff Files
6.7 years ago by
Hemarajata, Peera wrote:
Dear all,

I've been trying to get Galaxy to recognize this GFF from NCBI ( ftp:// 875/NC_010609.gff), but Galaxy failed to recognize the format after I uploaded it. Setting the datatype manually didn't work either: I got an "unable to set metadata" error as soon as I started a Cufflinks run using that GFF. I have tried to reformat the file several times and even tried using the popular script to re-parse the records from the original GenBank file.

Would anyone kindly look at the NCBI GFF and guide me to a solution to get this file recognized by Galaxy? I've been stuck for a couple of weeks now and would appreciate some suggestions.

Thank you!

Sincerely yours,
Peera Hemarajata, M.D.
Advanced graduate student - Versalovic lab
Department of Molecular Virology and Microbiology - Baylor College of Medicine
Department of Pathology - Texas Children's Hospital
Suite 830, 8th Floor Feigin Center.
Tel: 832-824-8245
gff cufflinks rna-seq • 1.0k views
6.7 years ago by
Peter Cock (European Union) wrote:
That *could* be because the NCBI's GFF3 is still horribly broken, but they are working on it, and the next release should have valid GFF, which I am looking forward to.

broken.html

However, if you get similar problems with a GFF3 file converted from GenBank using BioPerl, then I guess it is a Galaxy issue.

Peter
6.7 years ago by
Jennifer Hillman Jackson (United States) wrote:
Hi Peera,

I downloaded the file and stripped off the extra comment lines (two at the top starting with "#!" and one at the bottom, "##"). I loaded this into Galaxy as text, and when I attempted to set the datatype to GFF3 I ran into the metadata issues.

These links at GMOD have a GFF3 format specification:

Bringing the data into spec will be the only solution if you want to use it. While simple format errors could be corrected by working with the file in tabular format in Galaxy, more complex errors will likely need to be fixed before upload into Galaxy.

The GMOD validation tool can help pinpoint the errors. Enter the ftp URL into the form. When I ran it, the errors seemed to be with the "type" keywords used (they do not meet spec):

Line Number   Error/Warning
4             [WARNING] unknown directive (directive: ##Type DNA NC_010609.1)
5             [ERROR] invalid type (type: source)
10            [ERROR] invalid type (type: misc_feature)
11            [ERROR] invalid type (type: misc_feature)
12            [ERROR] invalid type (type: misc_feature)
13            [ERROR] invalid type (type: misc_feature)
14            [ERROR] invalid type (type: misc_feature)
15            [ERROR] invalid type (type: misc_feature)
16            [ERROR] invalid type (type: misc_feature)
17            [ERROR] invalid type (type: misc_feature)
... 158 pages of errors ...

If you have a history with a GFF3 file from the BioPerl program (the one you used and Peter suggested) that you believe to be in spec (it does not have the above content/errors, and you have verified this by passing the above validation test), and it still gives errors with Cufflinks, there could be another problem. A chromosome naming mismatch between the reference genome and the reference annotation is a common problem that you can examine first: all chromosome identifiers in the BAM/SAM results, the GTF/GFF3 annotation, and the reference genome must be identical.
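The identifier check described above can be approximated outside Galaxy. What follows is a minimal sketch, not a Galaxy or Cufflinks feature: it compares column 1 of a GFF/GTF file against the first word of each FASTA header. The function names and the toy records are invented for illustration and are not taken from the actual NCBI file.

```python
import io


def gff_seqids(handle):
    """Collect the column-1 sequence identifiers from GFF/GTF text."""
    ids = set()
    for line in handle:
        line = line.rstrip("\n")
        if line.startswith("##FASTA"):
            break  # an embedded FASTA section follows; no more features
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines, comments and directives
        ids.add(line.split("\t", 1)[0])
    return ids


def fasta_ids(handle):
    """Collect the first whitespace-delimited word of each FASTA header."""
    ids = set()
    for line in handle:
        if line.startswith(">"):
            words = line[1:].split()
            if words:
                ids.add(words[0])
    return ids


def report_mismatch(gff_handle, fasta_handle):
    """Return (ids only in the annotation, ids only in the reference)."""
    annotation = gff_seqids(gff_handle)
    reference = fasta_ids(fasta_handle)
    return sorted(annotation - reference), sorted(reference - annotation)


# Made-up toy data: the annotation carries a version suffix, the FASTA does not.
toy_gff = io.StringIO(
    "##gff-version 3\n"
    "NC_010609.1\tRefSeq\tgene\t1\t1000\t.\t+\t.\tID=gene0\n"
)
toy_fasta = io.StringIO(">NC_010609 hypothetical header text\nACGT\n")

only_in_annotation, only_in_reference = report_mismatch(toy_gff, toy_fasta)
print("only in annotation:", only_in_annotation)
print("only in reference:", only_in_reference)
```

Two empty lists mean the identifiers agree between these two files. The BAM/SAM header names would still need to be compared separately, for example by inspecting the @SQ lines.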
If that checks out, then please send a bug report from that failed Cufflinks job (green bug icon), and note in the comments that the bug report is from you if your Galaxy account has a different email address than the one used for this email. We can help rule out other types of problems that are common with this tool set.

Hopefully this helps, but if not, we can work with your bug report.

Best,
Jen
Galaxy team

-- Jennifer Jackson
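As a footnote to the validator output quoted in this answer: before uploading, it can help to see which values appear in the "type" column (column 3) and how often. The sketch below is only a quick tally under the assumption of a tab-delimited, nine-column GFF3 file. It does not replace the GMOD validator and does not decide which types are valid. The function name and the sample lines are made up for illustration.

```python
from collections import Counter


def tally_feature_types(lines):
    """Count the values in column 3 (type) of GFF3 feature lines."""
    counts = Counter()
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith("##FASTA"):
            break  # stop at an embedded FASTA section
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines, comments and directives
        fields = line.split("\t")
        if len(fields) != 9:
            # GFF3 feature lines have exactly nine tab-separated columns
            counts["<malformed line>"] += 1
            continue
        counts[fields[2]] += 1
    return counts


# Made-up sample lines, loosely modelled on the error report in this thread.
sample = [
    "##gff-version 3",
    "#!extra comment line",
    "NC_010609.1\tRefSeq\tsource\t1\t5000\t.\t+\t.\tID=id0",
    "NC_010609.1\tRefSeq\tgene\t10\t900\t.\t+\t.\tID=gene0",
    "NC_010609.1\tRefSeq\tmisc_feature\t20\t80\t.\t+\t.\tID=id1",
    "NC_010609.1\tRefSeq\tmisc_feature\t90\t150\t.\t+\t.\tID=id2",
]

type_counts = tally_feature_types(sample)
for feature_type, n in type_counts.most_common():
    print(feature_type, n)
```

Any type that the validator rejects (in the report above, "source" and "misc_feature") would then need to be brought into spec before the datatype can be set.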