Question: How to upload the mouse reference genome mm10 in FASTA format to my Galaxy history
bioadept wrote (8 months ago):

I tried to use a "tuxedo protocol" RNA-seq pipeline imported from the public workflows. I found the mouse .gtf annotation file in both the Galaxy Data Library and the UCSC Main table browser, but I could not find the mouse reference genome (FASTA) in the Galaxy Data Library.

Could you tell me how to find the mouse mm10 and human hg38 reference genomes in FASTA format and upload them into a Galaxy history?

I have attached a snapshot of how I assigned the RNA-seq datasets to the workflow: https://ibb.co/cYrgk6


Are any of these files the correct FASTA file to use as the reference genome for RNA-seq analysis with the tuxedo pipeline mentioned above (see the image at https://ibb.co/cYrgk6)?

The GENCODE genome FASTA file? ftp://ftp.sanger.ac.uk/pub/gencode/Gencode_mouse/release_M15/GRCm38.p5.genome.fa.gz

Or one of these transcript (RNA) FASTA files?
GENCODE ftp://ftp.sanger.ac.uk/pub/gencode/Gencode_mouse/release_M15/gencode.vM15.transcripts.fa.gz
UCSC http://hgdownload.cse.ucsc.edu/goldenPath/mm10/bigZips/mrna.fa.gz
NCBI ftp://ftp.ncbi.nih.gov/genomes/Mus_musculus/RNA/rna.fa.gz

Or is there a built-in FASTA reference for human/mouse at https://usegalaxy.org ?

written 8 months ago by bioadept

In this regard, can you give a gzipped FASTA (.fa.gz) file directly to Galaxy and then feed it to the data manager, or should the genome be unzipped first (on a local Galaxy)?

written 8 months ago by vebaev
Jennifer Hillman Jackson (United States) wrote (8 months ago):

Hello,

The workflow you are using takes the reference genome as a custom reference genome from the history at execution time. This is one way to do the analysis. Another is to install reference genome indexes on the server you are working on (if it is your own, or if you can make requests to whoever administers it). The third way is to use the built-in native indexes on the server you are working on.

It looks as if you are working on Galaxy Main https://usegalaxy.org. If so, then both mm10 and hg38 are natively indexed for most tools on the server. This means that you do not need to upload the reference genome to your history. It also increases the chance of a successful job, since building a new index for these larger genomes on every run can quickly use up resources. You will need to modify the workflow so that the tools use the built-in indexes instead of a custom reference genome.

Genomes are rarely kept in data libraries at Galaxy Main - instead, they are accessed directly by the tools that they are indexed for.

RNA-seq tutorials are here: https://galaxyproject.org/learn/

How to use a custom reference genome (and where to source one, for example UCSC) is explained in the last link here: https://galaxyproject.org/support/#troubleshooting Also review the Chromosome mismatch FAQ at the same location. All inputs must be based on exactly the same reference genome, or problems will come up with the tools and their results.
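
As a rough illustration of the chromosome mismatch check, here is a minimal Python sketch that compares the sequence names in a FASTA file with the names in the first column of a GTF file. It is not part of Galaxy or of the FAQ; the function names (open_text, fasta_names, gtf_names, compare_names) are made up for this example, and it assumes a standard tab-separated GTF with "#" comment lines only. Any name listed under "only_in_gtf" (for example "1" when the genome uses "chr1") suggests the two files are not based on the same reference build or naming scheme.

```python
import gzip
import io


def open_text(path):
    """Open a plain or gzip-compressed text file for reading (hypothetical helper)."""
    if str(path).endswith(".gz"):
        return gzip.open(path, "rt")
    return open(path, "r")


def fasta_names(handle):
    """Collect sequence identifiers: the first word after '>' on each header line."""
    names = set()
    for line in handle:
        if line.startswith(">"):
            fields = line[1:].split()
            if fields:
                names.add(fields[0])
    return names


def gtf_names(handle):
    """Collect the sequence names found in column 1 of a GTF file."""
    names = set()
    for line in handle:
        # Skip comment lines and blank lines.
        if line.startswith("#") or not line.strip():
            continue
        names.add(line.split("\t", 1)[0].strip())
    return names


def compare_names(fasta_handle, gtf_handle):
    """Report which sequence names are shared and which appear in only one file."""
    genome = fasta_names(fasta_handle)
    annotation = gtf_names(gtf_handle)
    return {
        "shared": sorted(genome & annotation),
        "only_in_gtf": sorted(annotation - genome),
        "only_in_fasta": sorted(genome - annotation),
    }


if __name__ == "__main__":
    # Tiny in-memory stand-ins for a genome FASTA and a GTF annotation.
    fasta = io.StringIO(">chr1 example\nACGT\n>chrM\nGGCC\n")
    gtf = io.StringIO(
        "#comment line\n"
        "1\tsrc\texon\t1\t4\t.\t+\t.\tgene_id \"g1\";\n"
        "chrM\tsrc\texon\t1\t4\t.\t+\t.\tgene_id \"g2\";\n"
    )
    print(compare_names(fasta, gtf))
```

For real files, the handles could come from open_text("genome.fa.gz") and open_text("annotation.gtf"), where both file names are placeholders.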

Hope that helps! Jen, Galaxy team
