Question: Importing GTF Into Galaxy
David Martin • 50 wrote:
Hi,
I have been trying to get reference data from the UCSC browser into
Galaxy, but when I request the rat genome in GTF format, the file
stops halfway through chromosome 10. This happens with both of the
available rat builds. I am guessing this is a problem with the UCSC
files and not Galaxy, but I was wondering if you could help. When I
request just chromosome 10, the GTF file halts in the same place as
it does for the whole genome, with this message at the bottom of the
file:
chr10 rn4_refGene start_codon 4293086 4293088 0.000000 + . gene_id "NM_001008876"; transcript_id "NM_001008876_dup1";
offsetToGenomic: need previous exon, but given index of 0
Any ideas? I have no problem importing the "KnownGenes" option from
UCSC. Would this file work well as a reference? What is the
difference between the different files from UCSC (i.e., "gene and gene
prediction tracks" vs. "mRNA and EST tracks", or "KnownGenes" vs.
"RefSeq")? I guess I could download each chromosome separately
to avoid that line in chromosome 10 and then concatenate them. I also
tried downloading the rat GTF from Ensembl, but when I brought it into
Galaxy it wasn't formatted properly, and it had some other type of
label instead of the NM_0001 accession numbers, so it looks like it
might need some grooming before use with the NGS suite.
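The per-chromosome workaround mentioned above could be sketched roughly as follows. This is only an illustration, not a tested Galaxy or UCSC procedure. It assumes each chromosome was saved as its own tab-delimited GTF file, and that stray lines such as the offsetToGenomic message can be recognized because they do not have nine tab-separated columns. The function names and file names are made up for the example.

```python
def is_gtf_record(line):
    """Return True if the line looks like a 9-column GTF feature line."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 9:
        return False
    # Columns 4 and 5 hold the start and end coordinates and must be integers.
    return fields[3].isdigit() and fields[4].isdigit()


def concat_gtf(paths, out_path):
    """Concatenate GTF files into out_path, keeping only feature lines.

    Blank lines and '#' comment lines are dropped silently. Any other line
    that does not look like a GTF record (for example an error message
    written into the download) is left out of the output and reported.
    Returns a list of (file, line number, text) tuples for reported lines.
    """
    skipped = []
    with open(out_path, "w") as out:
        for path in paths:
            with open(path) as handle:
                for number, line in enumerate(handle, start=1):
                    if not line.strip() or line.startswith("#"):
                        continue
                    if is_gtf_record(line):
                        out.write(line)
                    else:
                        skipped.append((str(path), number, line.rstrip("\n")))
    return skipped
```

Hypothetical usage: `concat_gtf(["chr1.gtf", "chr2.gtf", "chr10.gtf"], "rn4_refGene.gtf")` would write the combined file and return the lines it left out, so the chromosome 10 problem line can be inspected. Note that dropping the message does not restore any records UCSC failed to write after it.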
Thank you for any suggestions,
David Martin
dmarti@lsuhsc.edu
modified 7.8 years ago by Jennifer Hillman Jackson ♦ 25k • written 7.8 years ago by David Martin • 50