Question: Other reference genomes not listed
4.0 years ago by
Spain
carlos.cordon10 wrote:

Hello,

I really like Galaxy, but it does not include the reference genome of the organism I work with (Trypanosoma brucei). Is there any way to load a reference genome that is not listed on the website? Thanks in advance.

Carlos

customgenomes bowtie genomes • 913 views
modified 4.0 years ago • written 4.0 years ago by carlos.cordon10
4.0 years ago by
Daniel Blankenberg ♦♦ 1.7k
United States
Daniel Blankenberg ♦♦ 1.7k wrote:

Hi Carlos,

Most Galaxy tools can use a reference genome in the form of a fasta file uploaded to your history. See https://wiki.galaxyproject.org/Learn/CustomGenomes for more details.

Thanks for using Galaxy,

Dan

written 4.0 years ago by Daniel Blankenberg ♦♦ 1.7k
4.0 years ago by
Spain
carlos.cordon10 wrote:

Hi Dan,

Thanks for your answer. I uploaded the fasta version of my desired reference genome, but the website still tells me that no genome version in the required format is available. The fasta file I have is probably not sorted (it is only the plain text of all the bases, not organized into chromosomes), and I don't know how to sort it properly. Any suggestions?

Thanks

Carlos

written 4.0 years ago by carlos.cordon10

Hello,

Double-check that your dataset is assigned to the datatype "fasta". Also confirm that it is in strict fasta format: there is troubleshooting help in the wiki page linked above, and more in the video and the dedicated wiki page that it links to.
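As a rough illustration of what a format check might look like outside Galaxy, here is a minimal Python sketch. The rules it applies are assumptions chosen for the example (a '>' header before any sequence, an identifier directly after '>', no blank lines, only IUPAC nucleotide letters in sequence lines); the Galaxy wiki is the authority on what "strict" means there, and `check_fasta` is a hypothetical helper name.

```python
import re

# Assumed rule: sequence lines contain only IUPAC nucleotide letters.
VALID_SEQ = re.compile(r"^[ACGTUNRYKMSWBDHV]+$", re.IGNORECASE)

def check_fasta(lines):
    """Return a list of (line_number, problem) tuples; an empty list means
    none of the assumed rules were violated."""
    problems = []
    seen_header = False
    for n, raw in enumerate(lines, start=1):
        line = raw.rstrip("\r\n")
        if not line.strip():
            problems.append((n, "blank line"))
        elif line.startswith(">"):
            seen_header = True
            # The identifier should follow '>' immediately, with no space.
            if len(line) == 1 or line[1].isspace():
                problems.append((n, "header has no identifier after '>'"))
        elif not seen_header:
            problems.append((n, "sequence data before the first '>' header"))
        elif not VALID_SEQ.match(line):
            problems.append((n, "unexpected character in sequence line"))
    if not seen_header:
        # Line number 0 marks a whole-file problem.
        problems.append((0, "no '>' header line found"))
    return problems
```

A file that is only plain bases with no '>' lines, as described in the question, would be flagged by this check because it has no header at all.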

Sorting is optional for many tools/pipelines, but if you want to sort, or need to (for example, when using GATK), convert fasta -> tabular, then use 'Sort', then convert back to fasta. All the tools to do this are in Galaxy: see the groups "Filter and Sort" and "Fasta Manipulation".
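For anyone who prefers to do the same round trip offline, the fasta -> tabular -> sort -> fasta idea can be sketched in a few lines of Python. This is not what the Galaxy tools run internally; the function names are hypothetical, and the sort shown is a plain alphabetical sort on the header column.

```python
def fasta_to_tabular(text):
    """Turn FASTA text into (header, sequence) rows, one row per record."""
    rows = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            rows.append([line[1:], ""])   # start a new record
        elif rows:
            rows[-1][1] += line           # join wrapped sequence lines
    return [tuple(r) for r in rows]

def tabular_to_fasta(rows, width=60):
    """Turn (header, sequence) rows back into FASTA text, wrapped at `width`."""
    out = []
    for header, seq in rows:
        out.append(">" + header)
        out.extend(seq[i:i + width] for i in range(0, len(seq), width))
    return "\n".join(out) + "\n"

def sort_fasta(text):
    """Round trip: FASTA -> rows -> sort on the header column -> FASTA."""
    rows = fasta_to_tabular(text)
    rows.sort(key=lambda r: r[0])         # alphabetical, so chr10 sorts before chr2
    return tabular_to_fasta(rows)
```

Because the sort is alphabetical, it is only a starting point when a tool expects numerical chromosome order.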

You may need to slice the tabular file up (then put it back together) in order to sort properly for GATK, but those tools are also in Galaxy. See the group 'Text Manipulation'. Also see the GATK website for the sort order it expects. In short, it is numerical order, as in the list below. For genomes with alternate chromosome name styles, this can be a bit confusing, but try something similar, or use the sort order of the reference annotation data you are using (provided it is known to be GATK compatible).

chr1
chr2
chr3
...
chr21
chr22
chrX
chrY
chrM
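The order in the list above is not what a plain alphabetical sort produces (chr10 would land before chr2), so a custom sort key is needed. A minimal Python sketch follows; it assumes "chr"-prefixed names like those in the list, and placing any other names last in alphabetical order is a choice made for this example, not a GATK rule. `chrom_sort_key` is a hypothetical helper.

```python
import re

def chrom_sort_key(name):
    """Sort key: numbered chromosomes numerically, then X, Y, M,
    then any other name alphabetically (assumed fallback)."""
    core = re.sub(r"^chr", "", name, flags=re.IGNORECASE)
    if core.isdecimal():
        return (0, int(core), "")
    special = {"X": 1, "Y": 2, "M": 3, "MT": 3}
    rank = special.get(core.upper())
    if rank is not None:
        return (rank, 0, "")
    return (4, 0, name)

names = ["chr10", "chrM", "chr2", "chrX", "chr1", "chrY", "chrUn_random"]
ordered = sorted(names, key=chrom_sort_key)
print(ordered)
```

The same key can be passed to `sorted` over (name, sequence) records. For organisms whose chromosome names follow a different style, the key would need to be adapted to that naming scheme.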

Best, Jen, Galaxy team

written 4.0 years ago by Jennifer Hillman Jackson 25k