Question: Custom reference genome for RNA-seq
3.8 years ago by
United States
aberry20 wrote:

I am beginning to analyze some RNA-Seq data and am having some difficulty with the custom reference genome. My reference genome is the goat (Capra hircus). On NCBI, I can download a fasta file for each chromosome, but I do not see an option to download a single fasta file of the whole genome, which is how I interpreted the instructions on the wiki custom reference genome page. Do I have to run the analysis for each chromosome individually?

3.8 years ago by
United States
Jennifer Hillman Jackson25k wrote:


Most data providers include a bundled file that contains all chromosomes. It is usually in the same list as the individual chromosome files, but named slightly differently and labeled by version. First, double-check that this is not the case for your genome.

If you do by chance need to load individual chromosomes, do that first. Then use the tool "Text manipulation -> Concatenate datasets" to merge them all into one file.
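If you would rather merge the files on your own computer before uploading, the same concatenation can be sketched in Python. This is only an illustration, not part of Galaxy: the function name and the file names in the usage comment are hypothetical, and it assumes the per-chromosome files are plain (uncompressed) text.

```python
def concatenate_fasta(chromosome_paths, merged_path):
    """Append each per-chromosome fasta file to one merged file, in the order given."""
    with open(merged_path, "w") as merged:
        for path in chromosome_paths:
            last_line = "\n"
            with open(path) as handle:
                # Stream line by line so a large chromosome is never held in memory.
                for line in handle:
                    merged.write(line)
                    last_line = line
            # Guard against a missing final newline, which would glue the next
            # ">" title line onto the end of the previous sequence line.
            if not last_line.endswith("\n"):
                merged.write("\n")


# Hypothetical usage (file names are placeholders, not real NCBI names):
# concatenate_fasta(["chr1.fa", "chr2.fa", "chrX.fa"], "goat_genome.fa")
```

The order of the input list decides the order of the chromosomes in the merged file.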

Once merged, perform some QA. Specifically, double-check that the new fasta dataset (assign the datatype using the pencil icon if needed) does not contain extra spaces and is wrapped. You want the data in strict fasta format. The Troubleshooting portion of the second wiki link contains instructions for manipulating fasta data to achieve that format.
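As a rough illustration of that clean-up, here is a minimal Python sketch that rewrites fasta lines into a stricter form. It rests on assumptions of mine rather than on the wiki instructions: "extra spaces" is taken to mean blank lines, stray whitespace, and any description text after the identifier on the ">" title line, and the wrap width of 60 is an arbitrary choice. The function name is hypothetical.

```python
def normalize_fasta(lines, width=60):
    """Yield cleaned fasta lines: bare identifiers, no blank lines, fixed-width wrapping."""
    sequence = []

    def flush():
        # Emit the sequence collected so far, wrapped at `width` characters.
        joined = "".join(sequence)
        sequence.clear()
        for start in range(0, len(joined), width):
            yield joined[start:start + width]

    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # drop blank lines
        if line.startswith(">"):
            yield from flush()
            tokens = line[1:].split()
            if not tokens:
                raise ValueError("title line has no identifier")
            # Keep only the identifier: the text before the first whitespace.
            yield ">" + tokens[0]
        else:
            # Remove any whitespace inside the sequence line.
            sequence.append("".join(line.split()))
    yield from flush()


# Hypothetical usage (file names are placeholders):
# with open("goat_genome.fa") as src, open("goat_genome.clean.fa", "w") as dst:
#     for out_line in normalize_fasta(src):
#         dst.write(out_line + "\n")
```

Note that this sketch holds one full sequence in memory at a time before wrapping it, which should be acceptable for single chromosomes but is not tuned for efficiency.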

General help:

Specific, including Troubleshooting (section 8):

Take care, Jen, Galaxy team

