Question: Uploading Genome From Ensembl
7.1 years ago, Scroggins, Sheena wrote:
How do I upload the Zebrafish genome from Ensembl to my user history in Galaxy? I'm trying to map my RNA-Seq data with TopHat and need to map it to the Ensembl version of Zv9, but Galaxy only has the UCSC version built in. The Ensembl version is slightly different.

Thanks,
Sheena
7.1 years ago, Hiram Clawson wrote:
Can you clarify what is different between the UCSC and Ensembl versions of the Zebrafish Zv9 genome? Both appear to have the identical total of 1,412,464,843 nucleotides.

--Hiram
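One way to check this locally is to total the bases in each FASTA download and compare the results. The following is a minimal sketch, not a tool from Galaxy, UCSC, or Ensembl; the function name `count_nucleotides` is hypothetical and the parser assumes plain, uncompressed FASTA text.

```python
def count_nucleotides(fasta_lines):
    """Return {record_name: sequence_length} for an iterable of FASTA lines.

    Assumes plain FASTA: header lines start with '>' and the record name
    is the first whitespace-delimited token after it.
    """
    totals = {}
    name = None
    for line in fasta_lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            fields = line[1:].split()
            name = fields[0] if fields else ""
            totals[name] = 0
        elif name is not None:
            totals[name] += len(line)  # counts every character, including N
    return totals


def total_length(fasta_lines):
    """Sum of all record lengths."""
    return sum(count_nucleotides(fasta_lines).values())
```

With a real file this could be called as `total_length(open("genome.fa"))`, where `genome.fa` is a placeholder path. Equal totals do not by themselves show that two assemblies are identical, since masking (upper versus lower case) or sequence naming can still differ.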
7.1 years ago, Jennifer Hillman Jackson (United States) wrote:
Hello Sheena,

If your concern with using the UCSC version of the database has to do with chromosome naming and downstream Cufflinks analysis using Ensembl's reference GTF files, please see #5 in our FAQ, which demonstrates how to modify an Ensembl GTF file to be compatible with the UCSC chromosome naming (slight changes may be needed for each particular genome):
http://main.g2.bx.psu.edu/u/jeremy/p/transcriptome-analysis-faq

If there is another reason, please know that custom reference genomes (in fasta format) can be uploaded using FTP following this method:
http://wiki.g2.bx.psu.edu/Learn/Upload%20via%20FTP

Hopefully this helps,

Best,
Jen
Galaxy team

--
Jennifer Jackson
http://usegalaxy.org
http://galaxyproject.org/wiki/Support
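As a rough illustration of the kind of GTF renaming described above, here is a minimal Python sketch. It is not the method from the FAQ. The mapping is an assumption: numbered chromosomes gain a "chr" prefix, "MT" becomes "chrM", and every other sequence name is left unchanged. The function names are hypothetical, and the mapping should be checked against the actual names in both files before use.

```python
def ensembl_to_ucsc_name(name):
    """Map an Ensembl-style sequence name to a UCSC-style one.

    Assumed mapping (verify for your genome): "1" -> "chr1",
    "MT" -> "chrM", anything else unchanged.
    """
    if name == "MT":
        return "chrM"
    if name.isdigit():
        return "chr" + name
    return name


def convert_gtf_lines(lines):
    """Yield GTF lines with the first (seqname) column renamed."""
    for line in lines:
        # Pass comments and blank lines through untouched.
        if line.startswith("#") or not line.strip():
            yield line
            continue
        fields = line.split("\t", 1)  # only the first column is changed
        fields[0] = ensembl_to_ucsc_name(fields[0])
        yield "\t".join(fields)
```

A file could be converted by iterating over `convert_gtf_lines(open("in.gtf"))` and writing each line to a new file; `in.gtf` is a placeholder name.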
7.0 years ago, Jennifer Hillman Jackson (United States) wrote:
Hello Sheena,

It would be odd for this particular genome to exist in two versions, with different content, for the same release number/date. Ensembl updates annotation with new releases, but not the reference genome itself unless they also increment the genome build number. The Zebrafish project page at Ensembl states that UCSC has the latest release, meaning that the genome labeled "Zebrafish Jul. 2010 (Zv9/danRer7) (danRer7)" in Galaxy is expected to be the same as the one you would create by combining the files in their download area:
http://uswest.ensembl.org/Danio_rerio/Info/Index

But if there are known differences (perhaps you want different masking or haplotypes/chrY PAR inclusion/exclusion), the data can be combined either before or after upload into Galaxy (both approaches use a similar tool):
-- If prior, using the unix shell "cat" command is one option.
-- If after, load all chromosomes into your history and use the tool "Text Manipulation -> Concatenate datasets tail-to-head".

Best regards,
Jen
Galaxy team
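For readers without a unix shell, the "prior to upload" option can be sketched in Python. This is only an illustration of concatenating per-chromosome FASTA files into one file, roughly what "cat" does; the function name `concatenate_fasta` and the file names in the usage note are hypothetical. Unlike plain "cat", it adds a newline when an input file does not end with one, so the next header starts on its own line.

```python
def concatenate_fasta(input_paths, output_path):
    """Append the given files, in order, into output_path."""
    with open(output_path, "wb") as out:
        for path in input_paths:
            last = b"\n"  # treat an empty file as already newline-terminated
            with open(path, "rb") as src:
                # Copy in 1 MiB chunks so large chromosomes are not held in memory.
                for chunk in iter(lambda: src.read(1 << 20), b""):
                    out.write(chunk)
                    last = chunk[-1:]
            if last != b"\n":
                out.write(b"\n")
```

For example, `concatenate_fasta(["chr1.fa", "chr2.fa"], "genome.fa")` would write both inputs into `genome.fa`, which could then be uploaded to Galaxy as a custom reference genome.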
Reply, 7.0 years ago, Hiram Clawson wrote:
Good afternoon Sheena: Can you please explain what is different between the Ensembl and UCSC Zebrafish Zv9 genome sequences?

--Hiram