Question: Custom reference genome for RNA-seq
aberry2 (United States) wrote, 3.6 years ago:

I am beginning to analyze some RNA-Seq data and am having some difficulty with the custom reference genome. My reference genome is the goat (Capra hircus). On NCBI, I can download a fasta file for each chromosome, but I do not see an option to download a single fasta file for the whole genome, which is what I understood was needed from the wiki custom reference genome page. Do I have to run the analysis for each chromosome individually?

Jennifer Hillman Jackson (United States) wrote, 3.6 years ago:

Hello,

Most data providers include a bundled file that contains all chromosomes. This is usually in the same list as the individual chromosomes, but named slightly differently and labeled by version. First, double-check whether such a bundled file is available.

If you do by chance need to load individual chromosomes, do that first. Then use the tool "Text manipulation -> Concatenate datasets" to merge all into one file. 
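
If you would rather merge the files on your own computer before uploading, the same concatenation can be sketched in a few lines of Python. This is only an illustration and not part of Galaxy: the function name `concatenate_fasta` and the file names in the usage note are hypothetical, and it assumes the per-chromosome files are plain, uncompressed text.

```python
from pathlib import Path


def concatenate_fasta(chromosome_files, merged_path):
    """Append each per-chromosome FASTA file to one merged file, in the order given."""
    with open(merged_path, "w") as merged:
        for path in chromosome_files:
            text = Path(path).read_text()
            merged.write(text)
            # If a file lacks a final newline, add one so the next ">" header
            # starts on its own line instead of being glued to sequence data.
            if text and not text.endswith("\n"):
                merged.write("\n")
```

For example, `concatenate_fasta(["chr1.fa", "chr2.fa"], "goat_genome.fa")` would write both chromosomes into one file, which can then be uploaded as a single dataset.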

Once merged, perform some QA. Specifically, double-check that the new fasta dataset (assign the datatype using the pencil icon if needed) does not contain extra spaces and is wrapped. You want the data in strict fasta format. The Troubleshooting portion of the second wiki link below contains instructions for manipulating fasta data to achieve that format.
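
As a rough local pre-check before uploading, something like the Python sketch below can flag the most common formatting problems. It is not a Galaxy tool. The function name `check_fasta`, the 80-character wrap limit, and the exact set of checks are assumptions made for illustration; the wiki page remains the authoritative description of the expected format.

```python
def check_fasta(path, max_width=80):
    """Return a list of human-readable problems found in a FASTA file."""
    problems = []
    seen_ids = set()
    with open(path) as handle:
        for number, raw in enumerate(handle, start=1):
            line = raw.rstrip("\n")
            if not line.strip():
                problems.append(f"line {number}: blank line")
            elif line.startswith(">"):
                name = line[1:]
                # Assumption: a header should be a bare identifier with no spaces.
                if not name or any(ch.isspace() for ch in name):
                    problems.append(f"line {number}: header is empty or contains whitespace")
                if name in seen_ids:
                    problems.append(f"line {number}: duplicate identifier {name!r}")
                seen_ids.add(name)
            else:
                # Also catches a stray carriage return from Windows line endings.
                if any(ch.isspace() for ch in line):
                    problems.append(f"line {number}: whitespace inside sequence")
                if len(line) > max_width:
                    problems.append(f"line {number}: sequence line longer than {max_width} characters (not wrapped)")
    return problems
```

An empty list means none of these particular checks failed; it does not guarantee that every tool will accept the file.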

General help:
http://wiki.galaxyproject.org/Support#Custom_reference_genome

Specific, including Troubleshooting (section 8):
http://wiki.galaxyproject.org/Learn/CustomGenomes

Take care, Jen, Galaxy team
