Question: Downloading a job's results fails
blumr040 wrote:

I recently ran a job on the Galaxy server using the 'Extract MAF blocks' tool.

The run was performed on 25714 genomic regions and completed successfully overnight (or so it seems).

Via 'view details' I see that the size of the generated file is 8.1 GB.

 

I am trying to download the file to my computer, but the download keeps failing. I have made several attempts; each one starts and then stops after just a few seconds, and the download does not resume afterwards.

Is there a good way / alternative method to download the output file? 

 

Please advise. 

Tags: rna-seq, galaxy

What browser are you using? Chrome is often the best for downloading huge files. You might also want to look into command-line tools such as wget (http://askubuntu.com/questions/207265/how-to-download-a-file-from-a-website-via-terminal) or curl.

— Martin Čech
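A minimal sketch of either option; here '<link>' is a placeholder for the download link copied from the dataset in your Galaxy history:

  $ wget '<link>'
  $ curl -O '<link>'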
Jennifer Hillman Jackson wrote:

Hi,

From a shell/unix/terminal window on your computer, use the utility curl.

The link can be obtained by right-clicking the floppy disk icon inside a history item and choosing "Copy Link Location" (for most datasets) or "Download Dataset"/"Download bam_index" (BAM datasets have two downloads). Once you have the <link>, type the following (where "$" indicates the terminal prompt), with the <link> inside single quotes:

  $ curl -O '<link>' 
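For instance, with a hypothetical dataset link (the dataset id below is a placeholder, not a real one), the command would look like:

  $ curl -O 'https://usegalaxy.org/datasets/<dataset_id>/display?to_ext=maf'

With -O the local file name is taken from the URL; to pick your own name, the lower-case -o option can be used instead (see the comments further down).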

Our wiki has the same info here:
https://wiki.galaxyproject.org/Support#Downloading_data

Hope this helps! Jen, Galaxy team


Dear Jennifer,

Thanks for your advice.

I tried running the curl command, something like: curl -O <file_link> > merged_No_NM_NOT_hg19RefseqNMonly3C_Above200bp_OL_Blanket.maf

However, while the download starts and begins creating the output file on my computer, it only reaches a fraction of the complete size of the output file on Galaxy. The file on Galaxy is 8 GB, but only 31 MB were downloaded in one attempt and 117 MB in another; curl stops partway through and never completes the task.

Would you know by any chance how I could obtain my output file? Maybe by using FTP to the Galaxy server, or by adding some options to the curl command?

I also failed to use the wget command; could you give me the correct syntax that works with files stored on the Galaxy server?

Thanks a lot again, Roy

(edited to remove dataset link, to maintain privacy - Jen, Galaxy team)


I clicked your link and am currently downloading your file at 6 MB/s, with 2 GB already downloaded. Chrome.

— Martin Čech

I removed your dataset link from the post. Next time you can send these to us directly if asked to share: galaxy-bugs@bx.psu.edu

That said, just using a direct browser download is often the solution, as Martin stated and tested. The curl option is just another way. You do not need to use a redirect (" > outfile") with the -O option; you can use the lower-case -o option instead, or simply rename the file after the download. Here is a link to an online curl man page for Mac OS where you can review the syntax for -O and -o, but stop there: many of the other options will not work with the public Main server and are not needed anyway:
https://developer.apple.com/library/mac/documentation/Darwin/Reference/ManPages/man1/curl.1.html
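As a small illustration of the -o form (the output name and the dataset id below are placeholders):

  $ curl -o merged_regions.maf 'https://usegalaxy.org/datasets/<dataset_id>/display?to_ext=maf'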

My guess is that there is an instability in the network connection, or possibly a slower connection type such as DSL. If you can find a faster, more stable wi-fi connection, either of these approaches should work on an 8 GB file. For really large files, I personally would go ahead and use curl because I prefer the feedback about progress.

Best, Jen, Galaxy team

Daniel Blankenberg wrote:

I prefer to use wget with the continue option to download large files. e.g., "wget -c URL".
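A minimal sketch, using a placeholder dataset link; if the connection drops, rerunning the identical command continues from the partial file (provided the server supports resuming):

  $ wget -c 'https://usegalaxy.org/datasets/<dataset_id>/display?to_ext=maf'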


Agreed! And I've added the option to the wiki: https://wiki.galaxyproject.org/Support#Downloading_data

(this option was not always available, but is now!)

— Jennifer Hillman Jackson

Dear Jennifer,

My file eventually managed to complete its download. I simply repeated the curl -o command (curl -o myfile https://usegalaxy.org/datasets/bbd44e69cb8906b5f2c3ec5bc184fca5/display?to_ext=maf) several times, probably until the connection improved. We are not using wi-fi here, so I'm wondering why the connection was so poor to start with.
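An alternative to repeating the whole download, assuming the server supports byte-range requests: curl can resume into the partial file with -C - (the file name and link here are the ones from the command above):

  $ curl -C - -o myfile 'https://usegalaxy.org/datasets/bbd44e69cb8906b5f2c3ec5bc184fca5/display?to_ext=maf'

Rerunning this same command after a dropped connection picks up from the bytes already written to myfile rather than starting over.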

However, my attempt to run the wget command simply failed, with the following output; maybe you could interpret it and advise me on why it actually fails?

  $ wget -c https://usegalaxy.org/datasets/bbd44e69cb8906b5f2c3ec5bc184fca5/display?to_ext=maf
  --2014-05-14 17:34:46--  https://usegalaxy.org/datasets/bbd44e69cb8906b5f2c3ec5bc184fca5/display?to_ext=maf
  Resolving usegalaxy.org (usegalaxy.org)... 129.114.60.179, 129.114.60.180
  Connecting to usegalaxy.org (usegalaxy.org)|129.114.60.179|:443... connected.
  ERROR: The certificate of 'usegalaxy.org' is not trusted.
  ERROR: The certificate of 'usegalaxy.org' hasn't got a known issuer.

I should note that the link I've used here is the same link that actually worked for me with the curl command(!) So it must be something about the way wget should be executed...

Thanks a lot,

Roy





You can add the --no-check-certificate argument to wget when you get a certificate error.

— Daniel Blankenberg
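Combined with the -c option suggested above, the command would look something like this; whether --no-check-certificate is actually needed depends on the server's current certificate, so treat it as a workaround rather than a default:

  $ wget -c --no-check-certificate 'https://usegalaxy.org/datasets/bbd44e69cb8906b5f2c3ec5bc184fca5/display?to_ext=maf'

The saved file takes its name from the URL and can simply be renamed afterwards.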