Question: How to export large numbers of datasets out of Galaxy, given that the history 'Export to File' link will not work with curl
21 months ago by
Guy Reeves1.0k
Germany
Guy Reeves1.0k wrote:

Using the 'Download tip: Big data' from https://wiki.galaxyproject.org/Support I can get very large files out of Galaxy without having to download them to the machine I am running the web browser on. This works well for single files, e.g.:

curl -o bam_Fam1_n13_fam2_n18 'http://172.16.0.140/galaxy/datasets/f176ae4c569d530e/display?to_ext=bam'

But when I try to do the same thing for a link generated using the history 'Export to File' option, it does not work:

curl -o G6.tar.gz 'http://darwin.evolbio.mpg.de/galaxy/history/export_archive?id=24f04bd897e6fbd3'

I can successfully download the tar.gz file from within Galaxy to my hard drive by clicking on the link, but this is not what I want to do.

Is there any reason this is not working for compressed histories? Thanks, Guy

download curl wget history export • 961 views
ADD COMMENTlink modified 21 months ago • written 21 months ago by Guy Reeves1.0k
21 months ago by
Guy Reeves1.0k
Germany
Guy Reeves1.0k wrote:

You will need to install the tool 'Bundle Collection: Download a collection of files (Galaxy Version 1.0.1)' from the Tool Shed. Unfortunately, it is not currently available on usegalaxy.org. I have successfully tested this procedure for BAM files (you get a data file and an index file) and for FASTQ and FASTA files.

1 Using the checkbox button in the history panel, select all the files you want to export. Press the 'For all selected' button and select 'Build dataset list'. Name the list in the next window.

2 Then run the tool 'Bundle Collection' and select the dataset list you just created. This very useful tool could have better documentation. It is important to understand that it does not create a .zip file; instead it creates an .html link which you can use to download a zip file. If you click the view button of the tool's output dataset, all you will see is a list of the datasets which will be in the zip file.

3 The .html link can be obtained by right-clicking the floppy disk icon inside the history item and choosing "Copy Link Location" (for most datasets) or "Download Dataset/Download bam_index".

4 Once you have the <link>, type one of the following (where "$" indicates the terminal prompt), keeping the <link> inside single quotes. Both wget and curl have many options; these are examples commonly used with Galaxy.

$ wget -O outfile '<link>'

$ wget --no-check-certificate -O outfile '<link>' # ignore SSL certificate warnings

$ wget -c '<link>' # continue an interrupted download

$ curl -o outfile '<link>'

$ curl -o outfile --insecure '<link>' # ignore SSL certificate warnings

$ curl -C - -o outfile '<link>' # continue an interrupted download

For example, I use:

curl -o test2.zip 'http://172.XX.X.XXX/galaxy/datasets/f1b20b93d50e5f10/display?to_ext=html'

This generates a zip file called test2.zip.

5 You can then unpack it:

$ unzip test2.zip

This will create a folder called 'Bundled_Collection' containing all your files (two for each BAM file and one for each of the others).
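
Steps 4 and 5 can also be wrapped in a small shell script. This is only a rough sketch: the function names galaxy_dataset_url and fetch_bundle are made up for illustration, and the URL pattern is simply copied from the links shown in this thread, so it may differ on other Galaxy versions. Resuming with -C - only helps if the server supports it.

```shell
#!/bin/sh
# Sketch only: these helper names are illustrative and not part of Galaxy.

# Build a dataset display URL following the pattern of the links in this thread.
# $1 = Galaxy base URL, $2 = encoded dataset ID, $3 = value for to_ext
galaxy_dataset_url() {
    # ${1%/} drops one trailing slash from the base URL, if present
    printf '%s/datasets/%s/display?to_ext=%s\n' "${1%/}" "$2" "$3"
}

# Download a bundle and unpack it.
# $1 = link copied from the history item, $2 = output zip file, $3 = directory to unpack into
fetch_bundle() {
    # --fail: exit non-zero on an HTTP error instead of saving the error page
    # -C -  : resume a partial download if the output file already exists
    curl --fail -C - -o "$2" "$1" || return 1
    # Test the archive before unpacking, so a truncated download is caught early
    unzip -t "$2" >/dev/null || return 1
    unzip -d "$3" "$2"
}

# Example call (the base URL and dataset ID are placeholders to replace with your own):
#   fetch_bundle "$(galaxy_dataset_url 'http://your.galaxy.server/galaxy' 'DATASET_ID' 'html')" test2.zip .
```

Keeping the link in a quoted variable or argument matters for the same reason as the single quotes above: the "?" in the URL would otherwise be interpreted by the shell.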

Hope this works for you. Guy

ADD COMMENTlink modified 21 months ago • written 21 months ago by Guy Reeves1.0k
21 months ago by
United States
Jennifer Hillman Jackson25k wrote:

Hi Guy,

Making history archives retrievable from the command line has been discussed before as a potential enhancement, but there is no ticket yet. If you want to create an issue in the https://github.com/galaxyproject/galaxy repo, that would be the appropriate place. Or I can do this; let me know your preference.

Hope all is going good otherwise! Jen

ADD COMMENTlink written 21 months ago by Jennifer Hillman Jackson25k

Hi Jen

I think it would be a great feature to have, but I have no idea how much work it would require to do.

I will look into using tools from the Tool Shed that compress a list of datasets. I assume the compressed file could then be downloaded successfully using wget or curl. I am looking into the tools 'ziptool' and 'bundle_collections'. I will post the results here.

I do, however, think that until something is done, the 'Download tip: Big data' at https://wiki.galaxyproject.org/Support should be edited to make it clear that exported histories cannot be retrieved in this way; they can only be downloaded using the download button.

Cheers Guy SEE RESOLUTION BELOW

ADD REPLYlink modified 21 months ago • written 21 months ago by Guy Reeves1.0k