Using Data Managers for uploading new genome

4.5 years ago by

United States

I just put into production a small CloudMan-based Galaxy server for our institute. In addition to the specific tools that some users requested, we need to install some genome files and indices that are not present on the vanilla server. Specifically, I am trying to install the FASTA for the UCSC version of S. cerevisiae, sacCer3. I read the help (and watched the video tutorial) on the Data Managers tool, expecting that I would be able to fetch a copy of the FASTA from UCSC using the appropriate tool and then build the indices using the other tools.

However, all my attempts to do that failed... Am I missing something? Would someone be willing to share their experience and steps? It seems easy for sacCer2 in the video, but I just can't get it to work with other files...

Thanks

data managers galaxy cloudman • 1.8k views

modified 4.5 years ago by Dannon Baker ♦ 3.7k • written 4.5 years ago by Marco Blanchette • 10

Can you elaborate on what exactly failed when you attempted to follow the tutorial?

written 4.5 years ago by Dannon Baker ♦ 3.7k

Since the genome is not in the tool's drop-down list (my understanding is that this is populated from the tool-data/shared/ucsc/builds.txt file), I tried filling in the inputs myself (putting sacCer3 everywhere) and also leaving them empty. Same result either way; here is the error from the history:

ERROR: Unable to read builds for site file /mnt/galaxy/galaxy-app/lib/galaxy/util/../../../tool-data/shared/ucsc/ucsc_build_sites.txt
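
One way to start narrowing down an error like this is to check, on the server, whether the file named in the message exists and is readable by the account that runs Galaxy. The sketch below is only an illustration: `check_site_file` is a hypothetical helper, not part of Galaxy, and the idea that `.sample` files may sit in the same directory is an assumption about the install.

```shell
# check_site_file PATH  (hypothetical helper, not a Galaxy command)
# Report whether PATH is readable by the current user.
check_site_file() {
    path=$1
    if [ -r "$path" ]; then
        echo "readable: $path"
        return 0
    fi
    echo "not readable: $path" >&2
    # Assumption: the install may ship *.sample files next to the missing one.
    # List them if present; print nothing if there are none.
    ls "$(dirname "$path")"/*.sample 2>/dev/null
    return 1
}
```

For example, run as the Galaxy user: `check_site_file /mnt/galaxy/galaxy-app/lib/galaxy/util/../../../tool-data/shared/ucsc/ucsc_build_sites.txt`.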

Thanks a lot for all the great support!

modified 4.5 years ago • written 4.5 years ago by Marco Blanchette • 10

As a side note, the bowtie2 data manager works great to populate bowtie2/tophat2 with indices from genomes in the drop-down menu!!! I wish I could set the download volume, as I am filling my data volume with files that could be located elsewhere... (I did some poking around to see how difficult it would be to make CloudMan play nicely with software-RAID volumes, and I fear that this would involve major refactoring of the code...)

written 4.5 years ago by Marco Blanchette • 10

Dannon, not sure this is by design, but I went in and added a line to the tool-data/shared/ucsc/builds.txt file using the following command line:

sudo echo 'sacCer3 S. cerevisiae Apr. 2011 (SGD/sacCer3) (sacCer3)' >> /mnt/galaxy/galaxy-app/tool-data/shared/ucsc/builds.txt

I restarted Galaxy, then went back to the FASTA download data manager, selected the sacCer3 genome, used sacCer3 as the UCSC dbkey, and started the download.

It went perfectly fine (although the tool's result was an error report... weird...). I was then able to build tophat2 indexes.

Is that expected??? I'd like to upload hg38; is the first step to populate the builds.txt file?? One could perhaps automate that step through the GUI first and then start the download...

Let me know if the tool works as advertised or if you think it should behave the way I initially thought (i.e. no editing of the builds.txt file from the command line...)

modified 4.5 years ago • written 4.5 years ago by Marco Blanchette • 10

4.5 years ago by

Daniel Blankenberg ♦♦ 1.7k

United States

Daniel Blankenberg ♦♦ 1.7k wrote:

In the upcoming Galaxy release (it is currently available in our next-stable branch), we will have support for adding additional genome builds via Data Managers without having to manually edit those files, but until then, editing the builds.txt file is the way to go.
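
For anyone making that manual edit in the meantime, here is a minimal sketch of appending a build entry without adding it twice. It assumes builds.txt holds one build per line as a dbkey, a tab, then a description (check this against the lines already in your file); `add_build` is a hypothetical helper name, not a Galaxy command.

```shell
# add_build FILE DBKEY DESCRIPTION  (hypothetical helper, not a Galaxy command)
# Append "DBKEY<TAB>DESCRIPTION" to FILE unless DBKEY is already listed.
# Assumption: FILE is tab-separated with the dbkey in the first column.
add_build() {
    file=$1 dbkey=$2 desc=$3
    # awk exits 0 only if some line has DBKEY as its first tab-separated field.
    if [ -f "$file" ] && awk -F '\t' -v k="$dbkey" \
            '$1 == k { found = 1 } END { exit !found }' "$file"; then
        echo "$dbkey already present in $file" >&2
        return 0
    fi
    # printf writes a real tab between the two columns.
    printf '%s\t%s\n' "$dbkey" "$desc" >> "$file"
}
```

Usage would look like `add_build /mnt/galaxy/galaxy-app/tool-data/shared/ucsc/builds.txt sacCer3 'S. cerevisiae Apr. 2011 (SGD/sacCer3) (sacCer3)'`, followed by a Galaxy restart. Run it as an account that can write to the file: with `sudo echo ... >> file` the redirection is performed by the calling shell, not by root, so `sudo` does not help there.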

Could you provide additional information or error output on the error report that the tool showed as its result?

written 4.5 years ago by Daniel Blankenberg ♦♦ 1.7k

You'll find this interesting... the history tab turns green, thumbs up! But if you open the tab and look at view data, this is what's in the window:

ERROR: Unable to read builds for site file /mnt/galaxy/galaxy-app/lib/galaxy/util/../../../tool-data/shared/ucsc/ucsc_build_sites.txt

This is why I was very confused when I initiated the download following the edit of the builds.txt file. It was only when I was perusing the directory on the server that I realized the download had completed.

Let me know if you need more info. Looking forward to the next release!

written 4.5 years ago by Marco Blanchette • 10