Question: Downloading a job's results fails
blumr040 wrote:

I recently ran a job on the Galaxy server using the 'Extract MAF blocks' tool.

The run was performed on 25714 genomic regions and completed successfully overnight (or so it seems).

Via 'view details' I see that the size of the generated file is 8.1 GB.

 

I am trying to download the file to my computer, but the download keeps failing. I have made several attempts; each one starts and then stops after just a few seconds, and the download does not resume afterwards.

Is there a good way / alternative method to download the output file? 

 

Please advise. 

Tags: rna-seq, galaxy

What browser are you using? Chrome is often the best for downloading huge files. You might also want to look into command-line tools such as wget (http://askubuntu.com/questions/207265/how-to-download-a-file-from-a-website-via-terminal) or curl.

— Martin Čech
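A minimal sketch of either option; here '<link>' is a placeholder for the download link copied from the dataset in your Galaxy history:

  $ wget '<link>'
  $ curl -O '<link>'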
Jennifer Hillman Jackson wrote:

Hi,

From a shell/unix/terminal window on your computer, use the utility curl.

The link can be obtained by right-clicking the floppy disk icon inside a history item and choosing "Copy Link Location" (for most datasets) or "Download Dataset"/"Download bam_index" (BAM datasets have two downloads). Once you have the <link>, type the following (where "$" indicates the terminal prompt), with the <link> inside single quotes:

  $ curl -O '<link>' 
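For instance, with a hypothetical dataset link (the dataset id below is a placeholder, not a real one), the command would look like:

  $ curl -O 'https://usegalaxy.org/datasets/<dataset_id>/display?to_ext=maf'

With -O the local file name is taken from the URL; to pick your own name, the lower-case -o option can be used instead (see the comments further down).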

Our wiki has the same info here:
https://wiki.galaxyproject.org/Support#Downloading_data

Hope this helps! Jen, Galaxy team


Dear Jennifer,

Thanks for your advice.

I tried running the curl command, something like: curl -O <file_link> > merged_No_NM_NOT_hg19RefseqNMonly3C_Above200bp_OL_Blanket.maf

However, while the download starts and begins creating the output file on my computer, it only reaches a fraction of the complete size of the output file on Galaxy. The file on Galaxy is 8 GB, but only 31 MB were downloaded in one attempt and 117 MB in another; curl stops partway through and never completes the task.

Would you know by any chance how I could obtain my output file? Maybe by using FTP to the Galaxy server, or by adding some options to the curl command?

I also failed to use the wget command; could you give me the correct syntax that works with files stored on the Galaxy server?

Thanks a lot again, Roy

(edited to remove dataset link, to maintain privacy - Jen, Galaxy team)


I clicked your link and am currently downloading your file at 6 MB/s, with 2 GB already downloaded. Chrome.

— Martin Čech

I removed your dataset link from the post. Next time you can send these to us directly if asked to share: galaxy-bugs@bx.psu.edu

That said, just using a direct browser download is often the solution, as Martin stated and tested. The curl option is just another way. You do not need to use a redirect (" > outfile") with the -O option; you can use the lower-case -o option instead, or simply rename the file after the download. Here is a link to an online curl man page for Mac OS where you can review the syntax for -O and -o, but stop there: many of the other options will not work with the public Main server and are not needed anyway:
https://developer.apple.com/library/mac/documentation/Darwin/Reference/ManPages/man1/curl.1.html
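As a small illustration of the -o form (the output name and the dataset id below are placeholders):

  $ curl -o merged_regions.maf 'https://usegalaxy.org/datasets/<dataset_id>/display?to_ext=maf'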

My guess is that there is an instability in the network connection, or possibly a slower connection type such as DSL. If you can find a faster, more stable wi-fi connection, either of these approaches should work on an 8 GB file. For really large files, I personally would go ahead and use curl because I prefer the feedback about progress.

Best, Jen, Galaxy team

Daniel Blankenberg wrote:

I prefer to use wget with the continue option to download large files. e.g., "wget -c URL".
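A minimal sketch, using a placeholder dataset link; if the connection drops, rerunning the identical command continues from the partial file (provided the server supports resuming):

  $ wget -c 'https://usegalaxy.org/datasets/<dataset_id>/display?to_ext=maf'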


Agreed! And I've added the option to the wiki: https://wiki.galaxyproject.org/Support#Downloading_data

(this option was not always available, but is now!)

— Jennifer Hillman Jackson

Dear Jennifer,

My file eventually managed to complete its download. I simply repeated the curl -o command (curl -o myfile https://usegalaxy.org/datasets/bbd44e69cb8906b5f2c3ec5bc184fca5/display?to_ext=maf) several times, probably until the connection improved. We are not using wi-fi here, so I'm wondering why the connection was so poor to start with.
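An alternative to repeating the whole download, assuming the server supports byte-range requests: curl can resume into the partial file with -C - (the file name and link here are the ones from the command above):

  $ curl -C - -o myfile 'https://usegalaxy.org/datasets/bbd44e69cb8906b5f2c3ec5bc184fca5/display?to_ext=maf'

Rerunning this same command after a dropped connection picks up from the bytes already written to myfile rather than starting over.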

However, my attempt to run the wget command simply failed, with the following output; maybe you could interpret it and advise me on why it actually fails?

  $ wget -c https://usegalaxy.org/datasets/bbd44e69cb8906b5f2c3ec5bc184fca5/display?to_ext=maf
  --2014-05-14 17:34:46--  https://usegalaxy.org/datasets/bbd44e69cb8906b5f2c3ec5bc184fca5/display?to_ext=maf
  Resolving usegalaxy.org (usegalaxy.org)... 129.114.60.179, 129.114.60.180
  Connecting to usegalaxy.org (usegalaxy.org)|129.114.60.179|:443... connected.
  ERROR: The certificate of 'usegalaxy.org' is not trusted.
  ERROR: The certificate of 'usegalaxy.org' hasn't got a known issuer.

I should note that the link I've used here is the same link that actually worked for me with the curl command(!) So it must be something about the way wget should be executed...

Thanks a lot,

Roy





You can add the --no-check-certificate argument to wget when you get a certificate error.

— Daniel Blankenberg
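Combined with the -c option suggested above, the command would look something like this; whether --no-check-certificate is actually needed depends on the server's current certificate, so treat it as a workaround rather than a default:

  $ wget -c --no-check-certificate 'https://usegalaxy.org/datasets/bbd44e69cb8906b5f2c3ec5bc184fca5/display?to_ext=maf'

The saved file takes its name from the URL and can simply be renamed afterwards.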