Question: Adding a reference genome to local Galaxy
1
gravatar for ekob20
16 months ago by
ekob2010
United Kingdom
ekob2010 wrote:

Hello

I am using a local instance of Galaxy to call variants among six sets of reads from mouse strains that I aligned to the mm10 reference genome using Bowtie2. The alignments were done on the main Galaxy server, but I have switched to a local Galaxy instance because the files exceed the maximum space allowed on this server. The problem I am having is that the mm10 reference genome is not available in the local Galaxy instance, and so the programs I need do not recognise the database for the aligned read files.

I have tried to follow the instructions at https://wiki.galaxyproject.org/Admin/UseGalaxyRsync, but they do not seem to have worked (I still can't see mm10 in the list of options when I try to update the database for a sample).

I would appreciate any further guidance on how to import a reference genome from the main Galaxy to my local instance.

Kind Regards,

Eleanor

Hotz, Hans-Rudolf (Switzerland) wrote 16 months ago:

Hi Eleanor

Have you added mm10 to ~/tool-data/shared/ucsc/builds.txt and restarted the server?

 

Regards, Hans-Rudolf

ekob20 (United Kingdom) wrote 16 months ago:

Hi Hans-Rudolf

Thanks for your reply.  No, I tried installing the 'Rsync with g2' tool because I had the impression this would allow me to import data directly from the Main Galaxy.  This tool seems to have installed correctly but I'm unsure what I need to do next...

When you suggest adding mm10, do you mean I would download an mm10 file first and then add it? I was keen to import the exact data from Galaxy Main if possible because I have already run the alignments, so would like to be able to access the same version of the reference genome as the reads were originally aligned to.

Many thanks

Eleanor

Hotz, Hans-Rudolf (Switzerland) wrote 16 months ago:

It looks like we are talking about two different things. I thought you could not change the 'database' setting (i.e. the "dbkey") to 'mm10' for your history item; is that right? The "builds.txt" file is a list of all potential genomes you can use in Galaxy (it is independent of any tool, but required to set the "dbkey" correctly). For the mouse genomes, our "builds.txt" file looks like this:

[]$ grep mm builds.txt
mmtv    MMTV Nov. 2009 (MMTV/mmtv) (mmtv)
mm10    Mouse Dec. 2011 (GRCm38/mm10) (mm10)
mm9    Mouse July 2007 (NCBI37/mm9) (mm9)
mm8    Mouse Feb. 2006 (NCBI36/mm8) (mm8)
mm7    Mouse Aug. 2005 (NCBI35/mm7) (mm7)
[]$
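
Note that a plain grep for "mm" also matches mmtv. For an exact check on a single dbkey, something like the following sketch may help. It assumes builds.txt is tab-separated with the dbkey in the first column; the function name has_dbkey is made up for illustration, and the path in the usage comment is the one mentioned earlier in this thread.

```shell
#!/bin/sh
# has_dbkey FILE KEY
# Succeeds (exit status 0) if the first tab-separated column of FILE
# is exactly KEY on some line, and fails (exit status 1) otherwise.
# Assumption: builds.txt uses the layout dbkey<TAB>description.
has_dbkey() {
    awk -F '\t' -v key="$2" '$1 == key { found = 1 } END { exit !found }' "$1"
}

# Example usage (adjust the path to your own Galaxy installation):
# has_dbkey ~/tool-data/shared/ucsc/builds.txt mm10 && echo "mm10 is listed"
```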

 

Hans-Rudolf

ekob20 (United Kingdom) wrote 16 months ago:

The 'database' setting for my aligned read files (sorted BAM files) was mm10 when I ran the alignments in Galaxy main. However, when I upload those files to my local Galaxy, the database comes up as '?'.  I was assuming this was because the mm10 reference genome was not installed and so it didn't recognise the database. I had therefore been hoping that if I could make mm10 available in my local Galaxy, I could change the database for my BAM files. Is this not the case?

Hotz, Hans-Rudolf (Switzerland) wrote 16 months ago:

As I wrote in my previous post, the database setting is independent of the tool. It could refer to the reference genome (i.e. the FASTA file), to a Bowtie index or any other index, to a 2bit file, to a file containing the chromosome lengths, etc.

When you upload data, you have to set the dbkey (database setting) manually. If the option 'mm10' is not available, you have to add it to the "builds.txt" file.

Once the dbkey is set, whether you need the reference genome or an index file depends on the next tool you want to use.

 

Hans-Rudolf

ekob20 (United Kingdom) wrote 16 months ago:

OK, that makes sense. I guess my question then is how do I add an option to the 'builds.txt' file?

Thank you very much for all your help. 

Hotz, Hans-Rudolf (Switzerland) wrote 16 months ago:

Open the file with your favourite text editor and add the extra line, e.g.:

mm10    Mouse Dec. 2011 (GRCm38/mm10) (mm10)

I recommend not adding the line at the end of the file, but directly above or below the entry for mm9.
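
If you would rather not edit the file by hand, the same change could be scripted roughly as follows. This is only a sketch under some assumptions: builds.txt is tab-separated (dbkey, then description), and the function name add_dbkey_after is made up for illustration. It inserts the new entry directly below an existing one and does nothing if the dbkey is already listed.

```shell
#!/bin/sh
# add_dbkey_after FILE ANCHOR KEY DESCRIPTION
# Inserts "KEY<TAB>DESCRIPTION" on the line after the entry whose first
# column is ANCHOR. Leaves FILE untouched if KEY is already present.
# Assumption: FILE is tab-separated with the dbkey in the first column.
add_dbkey_after() {
    file=$1 anchor=$2 key=$3 desc=$4

    # Skip the edit when the key is already listed.
    if awk -F '\t' -v key="$key" '$1 == key { found = 1 } END { exit !found }' "$file"; then
        return 0
    fi

    tmp=$(mktemp) || return 1
    awk -F '\t' -v anchor="$anchor" -v key="$key" -v desc="$desc" '
        { print }
        # Insert once, right below the first line matching the anchor.
        $1 == anchor && !inserted { print key "\t" desc; inserted = 1 }
        # Fallback: if the anchor entry is missing, append at the end.
        END { if (!inserted) print key "\t" desc }
    ' "$file" > "$tmp" && cp "$tmp" "$file"
    status=$?
    rm -f "$tmp"
    return $status
}

# Example usage (make a backup first; adjust the path to your installation):
# cp ~/tool-data/shared/ucsc/builds.txt ~/tool-data/shared/ucsc/builds.txt.bak
# add_dbkey_after ~/tool-data/shared/ucsc/builds.txt mm9 mm10 'Mouse Dec. 2011 (GRCm38/mm10) (mm10)'
```

As mentioned earlier in the thread, the server has to be restarted afterwards for the new entry to show up.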

 

 

 

Jennifer Hillman Jackson (United States) wrote 16 months ago:

Hello,

First, great advice Hans-Rudolf!

Second, we'd like to ensure that the Data Manager data_manager_rsync_g2 is functioning optimally for all users as a one-step process for loading up new genomes/data (regardless of command-line skills). The important part is that it should not require manual config/reference file manipulations. But it sounds as if it does.

The tool form's dbkey selection is extracted from the rsync server (as far as I know). But perhaps it should also have a section concerning creating a new dbkey or selecting from existing dbkeys to associate with the new data.

Examples of this type of dbkey association (if the dbkey exists) and creation of a new one (if it does not) can be found in these Data Managers:
data_manager_fetch_genome_all_fasta
data_manager_fetch_genome_dbkeys_all_fasta

Is an additional tool feature request needed (to enhance usability)? Are there other factors in play? Thoughts?

An issue can be opened at https://github.com/galaxyproject/tools-devteam by anyone, if warranted.

Thanks, Jen, Galaxy team

ekob20 (United Kingdom) wrote 16 months ago:

Hi Jen

Thanks for your reply.  It seems like data_manager_rsync_g2 is not functioning the way it sounds like it should. I have installed this and restarted Galaxy but this doesn't seem to have updated options for dbkey selection. Am I to understand that built-in indexes for tools should now include the same options as are available in the Main Galaxy? Currently all tools are still saying 'no options available' when I select 'use a built-in genome index'.

Thanks, Eleanor

qaedi.65 wrote 25 days ago:

Dear all, could somebody help me add the hg38 reference genome to my local Galaxy instance in a simple way? I read your posts and the wiki to find a solution, but all of them are too advanced for me. I am a newbie in bioinformatics and really need help.

Any comment is highly appreciated. Hami
