I just put a small CloudMan-based Galaxy server into production for our institute. In addition to the specific tools that some users requested, I need to install some genome files and indices that are not present on the vanilla server. Specifically, I am trying to install the FASTA for the UCSC version of the S. cerevisiae genome, sacCer3. I read the help (and watched the video tutorial) on the Data Managers tools, expecting that I would be able to fetch a copy of the FASTA from UCSC with the appropriate tool and then build the indices with the other tools.
However, all my attempts to do that failed... Am I missing something? Would someone be willing to share their experience and steps? It seems easy for sacCer2 in the video, but I just can't get it to work with other genomes...
Can you elaborate on what exactly failed when you attempted to follow the tutorial?
The genome is not in the tool's drop-down list (my understanding is that the list is populated from the tool-data/shared/ucsc/builds.txt file), so I tried both filling in the inputs by hand (putting sacCer3 everywhere) and leaving them empty. The result was the same either way; here is the error from the history:
ERROR: Unable to read builds for site file /mnt/galaxy/galaxy-app/lib/galaxy/util/../../../tool-data/shared/ucsc/ucsc_build_sites.txt
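A quick way to narrow this down is to check, from the shell, whether the dbkey is actually listed in builds.txt and whether the site file named in the error is readable. The following is only a sketch: has_build is a hypothetical helper (not part of Galaxy), and it assumes builds.txt is tab-separated with the dbkey in the first column, which is worth confirming against the existing lines in your file.

```shell
#!/bin/sh
# has_build FILE DBKEY
# Hypothetical helper: succeeds if DBKEY appears in column 1 of FILE.
# Assumes FILE is tab-separated (cut splits on tabs by default).
has_build() {
    if [ ! -r "$1" ]; then
        echo "cannot read $1" >&2
        return 2
    fi
    # -x: match the whole line, -F: treat the dbkey as a fixed string
    cut -f1 "$1" | grep -qxF "$2"
}

# Example usage, with the paths taken from the error message above:
# UCSC_DIR=/mnt/galaxy/galaxy-app/tool-data/shared/ucsc
# if has_build "$UCSC_DIR/builds.txt" sacCer3; then echo "listed"; else echo "not listed"; fi
# [ -r "$UCSC_DIR/ucsc_build_sites.txt" ] || echo "site file missing or unreadable"
```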
Thanks a lot for all the great support!
As a side note, the bowtie2 data manager works great for populating bowtie2/tophat2 with indices for the genomes in the drop-down menu!!! I wish I could set the download volume, though, as I am filling my data volume with files that could be located elsewhere... (I did some poking around to see how difficult it would be to make CloudMan work with software-RAID volumes, and I fear that it would involve major refactoring of the code...)
Dannon, I'm not sure this is by design, but I went in and added the following line to the tool-data/shared/ucsc/builds.txt file with this command:
sudo echo 'sacCer3 S. cerevisiae Apr. 2011 (SGD/sacCer3) (sacCer3)' >> /mnt/galaxy/galaxy-app/tool-data/shared/ucsc/builds.txt
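For anyone repeating this step, here is a slightly more defensive sketch of the same append. It assumes builds.txt is tab-separated (dbkey, then description), so check the existing lines in your copy first; add_build is a hypothetical helper name, not something Galaxy provides. It skips the append when the dbkey is already listed, so running it twice does not create duplicate entries.

```shell
#!/bin/sh
# add_build FILE DBKEY DESCRIPTION
# Hypothetical helper: appends "DBKEY<TAB>DESCRIPTION" to FILE unless
# DBKEY is already present in the first (tab-separated) column.
add_build() {
    builds_file=$1
    dbkey=$2
    description=$3
    if [ -f "$builds_file" ] && cut -f1 "$builds_file" | grep -qxF "$dbkey"; then
        echo "$dbkey already present in $builds_file"
        return 0
    fi
    # printf writes a real tab between the two fields
    printf '%s\t%s\n' "$dbkey" "$description" >> "$builds_file"
}

# Example usage (path as in the command above):
# add_build /mnt/galaxy/galaxy-app/tool-data/shared/ucsc/builds.txt \
#     sacCer3 'S. cerevisiae Apr. 2011 (SGD/sacCer3) (sacCer3)'
```

One caveat on the original command: in `sudo echo ... >> file` the redirection is performed by the calling shell, not by root, so the sudo does not help with write permissions. If the file is only writable by root, piping into `sudo tee -a file` is the usual workaround.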
I restarted Galaxy, then went back to the FASTA download data manager, selected the sacCer3 genome, used sacCer3 as the UCSC dbkey, and started the download.
It went perfectly fine (although the tool's result was an error report... weird...). I was then able to build the tophat2 indexes.
Is that expected? I'd like to add hg38 next; is the first step to populate the builds.txt file? Perhaps that step could be automated through the GUI first, before starting the download...
Let me know if the tool works as advertised, or if you think it should behave the way I initially expected (i.e. no editing of the builds.txt file from the command line...)