Hello,
The workflow you are using supplies the reference genome as a custom reference genome from the history at run time. This is one way to do the analysis. Another is to install reference genome indexes on the server you are working on (if it is your own server, or if you can make requests). The third way is to use the built-in native indexes on the server you are working on.
It looks as if you are working on Galaxy Main (https://usegalaxy.org). If so, both mm10 and hg38 are natively indexed for most tools on the server, so you do not need to upload the reference genome to your history. Using the built-in indexes also increases the chance of a successful job, since building a new index for these larger genomes on every run can quickly use up resources. You will need to modify the workflow so that tools use the built-in indexes instead of a custom reference genome.
Genomes are rarely kept in data libraries at Galaxy Main - instead, they are accessed directly by the tools that they are indexed for.
RNA-seq tutorials are here: https://galaxyproject.org/learn/
The last link at https://galaxyproject.org/support/#troubleshooting explains how to use a custom reference genome (and where to potentially source one, for example UCSC). Also review the Chromosome mismatch FAQ at the same location: all inputs must be based on the exact same reference genome, or problems will come up with tools and results.
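As a rough illustration of the chromosome mismatch check, here is a minimal Python sketch that compares the sequence names in a FASTA file with the names in the first column of a GTF annotation. It is not part of Galaxy; the function names (`open_text`, `fasta_names`, `gtf_names`, `mismatch_report`) are hypothetical, and it assumes a standard tab-separated GTF with `#` comment lines.

```python
import gzip
import io


def open_text(path):
    """Open a plain or gzip-compressed text file for reading (hypothetical helper)."""
    if str(path).endswith(".gz"):
        return gzip.open(path, "rt")
    return open(path, "r")


def fasta_names(handle):
    """Collect sequence identifiers: the first word after '>' on each header line."""
    names = set()
    for line in handle:
        if line.startswith(">"):
            parts = line[1:].split()
            if parts:
                names.add(parts[0])
    return names


def gtf_names(handle):
    """Collect the sequence names from column 1 of a GTF, skipping comments and blanks."""
    names = set()
    for line in handle:
        if line.startswith("#") or not line.strip():
            continue
        names.add(line.split("\t", 1)[0].strip())
    return names


def mismatch_report(fasta_handle, gtf_handle):
    """Return the names that appear in only one of the two inputs."""
    fa = fasta_names(fasta_handle)
    gtf = gtf_names(gtf_handle)
    return {
        "only_in_fasta": sorted(fa - gtf),
        "only_in_gtf": sorted(gtf - fa),
    }


if __name__ == "__main__":
    # Tiny in-memory example: the FASTA uses "chr1", the GTF uses "1".
    fasta = io.StringIO(">chr1 example\nACGT\n>chr2\nGG\n")
    gtf = io.StringIO('1\tsrc\tgene\t1\t4\t.\t+\t.\tgene_id "a";\n'
                      'chr2\tsrc\tgene\t1\t2\t.\t+\t.\tgene_id "b";\n')
    print(mismatch_report(fasta, gtf))
```

For real files, something like `mismatch_report(open_text("genome.fa.gz"), open_text("genes.gtf"))` could be used (both file names are placeholders). Any name listed under `only_in_gtf` points to annotation lines whose sequence name has no counterpart in the genome, which is the typical symptom of mixing sources such as UCSC-style (`chr1`) and Ensembl-style (`1`) naming.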
Hope that helps! Jen, Galaxy team
Are any of these files the correct FASTA file to use as the reference genome for RNA-seq analysis with the Tuxedo pipeline mentioned in the above comment (see the image at https://ibb.co/cYrgk6 )?
The Gencode genome FASTA file? ftp://ftp.sanger.ac.uk/pub/gencode/Gencode_mouse/release_M15/GRCm38.p5.genome.fa.gz
Or one of these transcript (RNA) FASTA files? Gencode: ftp://ftp.sanger.ac.uk/pub/gencode/Gencode_mouse/release_M15/gencode.vM15.transcripts.fa.gz UCSC: http://hgdownload.cse.ucsc.edu/goldenPath/mm10/bigZips/mrna.fa.gz
NCBI: ftp://ftp.ncbi.nih.gov/genomes/Mus_musculus/RNA/rna.fa.gz
Or is there a built-in FASTA for human/mouse on https://usegalaxy.org ?
In this regard, can you give a gzipped FASTA (.fa.gz) file directly to Galaxy and then feed it to the data manager, or should the genome be unzipped first (local Galaxy)?