Question: exporting dataset (history) or Bam file to another server
4.6 years ago by
United States
ahdee30 wrote:

Dear all: 

I'm trying to either move a BAM file generated by TopHat or move an entire history. To do the latter, I tried exporting the history to a file. When the export is done, the result is a tar.gz file. I then go to the new server and try to import the file (I tried both the link and downloading the tar.gz), but the result is the same: "importingcry..." "This history will be visible when the import is complete". It just gets stuck on this screen and nothing happens. Anyone have a clue as to how to do this?

OK, so what I did next was try to upload the individual BAM files generated by TopHat, but when I do this I keep getting back a traceback error ;( so I'm stuck. I also tried to "share" the library with a link, but the result is the same, with the same message that the history will be visible when the import is complete. I would be totally grateful if someone could help me with this! Thanks, all.


4.6 years ago by
United States
Jennifer Hillman Jackson25k wrote:

Hello ahdee,

We apologize for the current state of import/export by URL using the history menu options. We thought we had this functional (but still needing some improvements, such as better feedback about progress!) - however a thorough round of testing today prompted by your question revealed that there is indeed a problem with the "Import from File" function. 

The "Export to File" function is fine from our testing, so you should be able to download an intact tar archive of work. And in some cases, an import may be successful. But it isn't always, and it should be. And the preparation of that archive when export is initiated takes some time to process - the larger the history, the longer this takes - and with no feedback about this processing it can be difficult to know when it is ready.
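One way to check that a downloaded export really is an intact archive is to read every member to the end before relying on it as a backup. The following is a minimal sketch using only Python's standard library; the function name and the example filename are hypothetical, and it only confirms that the tar.gz can be read in full, not that Galaxy will import it.

```python
import tarfile
import zlib


def archive_is_readable(path):
    """Return True if every file in the tar archive can be read to the end.

    A truncated or corrupted download fails part-way through and
    yields False instead of raising.
    """
    try:
        # "r:*" lets tarfile detect the compression (gzip for .tar.gz).
        with tarfile.open(path, "r:*") as archive:
            for member in archive:
                if not member.isfile():
                    continue
                extracted = archive.extractfile(member)
                # Read in 1 MiB pieces so large datasets are never held in memory.
                while extracted.read(1024 * 1024):
                    pass
        return True
    except (tarfile.TarError, EOFError, OSError, zlib.error):
        return False


# Hypothetical filename, for illustration only:
# print(archive_is_readable("exported-history.tar.gz"))
```

A check like this is cheap compared with re-exporting a large history, so it may be worth running before deleting anything on the source server.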

First, I want to point you to our development ticket where we have captured the items we plan on working on to improve this suite of functions. You can upvote here, add comments about other usage features you would find helpful, follow progress while we work on this, and the like (as can all others reading this post!). We count on community feedback and usability is very important to us. John Chilton is leading the effort and if anyone can tackle this with savvy and grace, I for one believe that he is certainly up for the challenge! If you haven't used our Trello system before, the wiki link after the ticket link can help to get you acquainted:

"So, all this is great, but I JUST WANT TO MOVE MY DATA!" 

I am going to recommend a few things:

1. Go ahead and download your complete histories as tar files and save them as backups. Once the upload function is operational, these will be useful again. That said, I would probably not permanently delete anything that I dearly loved at this point. Just as a precaution. Wait until upload is confirmed as functional in another Galaxy and the integrity of that export is 100% confirmed. 

2. Move what you need to move one dataset at a time. Use 'curl' to download large files and FTP to upload large files.

3. Re-run the analysis that you need at the other Galaxy, if you want to make this less tedious/move less data around. Then, if you have the resources/time, consider extracting a workflow from the history, editing it so that it reproduces the full processing, publishing it and putting it in the Tool Shed, and installing it as a repository into your cloud/local Galaxy to recreate the rest of the data there. This probably involves a few new steps you haven't done before. These functions are not for developers - they are written by developers - developers may use these when supporting scientists but that's the point. The end goal of all of this is to get some analysis done - and that means it is FOR scientists! We need to keep working on it until scientists can use it. Try it, give feedback, we'll do our best to make usability BETTER.
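For recommendation 2, moving one dataset at a time, the same download-then-FTP-upload pattern can also be scripted. This is only a sketch in Python's standard library, standing in for 'curl' and an FTP client: the dataset URL, FTP host, and credentials shown in the comments are placeholders, and whether a given Galaxy server accepts plain FTP (rather than FTP over TLS) is an assumption to check with its administrators. Comparing checksums before and after the transfer is an extra step added here to confirm the copy is complete.

```python
import ftplib
import hashlib
import shutil
import urllib.request
from pathlib import Path

CHUNK = 1024 * 1024  # stream in 1 MiB pieces so large BAM files never sit in memory


def download(url, dest):
    """Stream the dataset at `url` to the local file `dest`; return its Path."""
    dest = Path(dest)
    with urllib.request.urlopen(url) as response, dest.open("wb") as out:
        shutil.copyfileobj(response, out, CHUNK)
    return dest


def sha256_of(path):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for block in iter(lambda: handle.read(CHUNK), b""):
            digest.update(block)
    return digest.hexdigest()


def upload_via_ftp(path, host, user, password):
    """Upload one file to an FTP server under its own file name."""
    path = Path(path)
    with ftplib.FTP(host) as ftp, path.open("rb") as handle:
        ftp.login(user, password)
        ftp.storbinary(f"STOR {path.name}", handle, blocksize=CHUNK)


# Placeholder values, for illustration only:
# local = download("https://source-galaxy.example.org/path/to/dataset", "accepted_hits.bam")
# print(sha256_of(local))
# upload_via_ftp(local, "ftp.target-galaxy.example.org", "you@example.org", "your-password")
```

Recording the checksum on the source side and recomputing it after the file arrives gives a simple way to tell a truncated transfer from a file that the target server rejects for other reasons.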

Of course #3 is totally optional, just thought I'd include it while we were on the subject. Publishing workflows in the Tool Shed is a somewhat new feature, super useful, and (I suspect) not widely known or understood. Why would this be useful? Apart from the distinction of being an author of a published analysis method intended for use on ANY Galaxy (a wider context than publishing on the Main instance, although any could be converted to repositories!)...  Different Galaxies have different tool sets installed natively. Sometimes workflows need to be adjusted, or new tools need to be installed to meet the processing requirements for a workflow, or new local data needs to be set up. A lot of work has gone into making the guided install of workflow-dependent tools as straightforward as possible. There is likely even more that could be done. We want feedback about this too! We can make guesses about what is needed, but nothing beats real use cases and requests from the community. And should everything go just perfectly exactly as you envisioned it (!), and you are just running a local for your own use, or a cloud for occasional use, these are empowering skills to have in your pocket. 

The Tool Shed is one of the best documented parts of Galaxy. Greg and Dave B. offer great support. Get started here:

I gave you the long answer. But this included information for how we handle things when issues come up, how we communicate with the community about that, and our various feedback routes. A reference of sorts for the new forum. I may reformat into a Forum page, but for now I took the opportunity here, for everyone that may not be aware of the available resources or processes in place.

Thanks for your patience and again, our sincere apologies for the current inconveniences,

Jen, Galaxy team




Hi Jennifer, thanks for such a thoughtful response!  Looking forward to the patch.  I've been doing what you suggest, but got stuck again in step #2, since for some reason uploading BAM files generated by TopHat always gives some kinda weird traceback error.  My only solution was to upload my raw trimmed dataset and rerun TopHat on my new Galaxy server.  Other than that everything works nicely.  

thanks again.


written 4.6 years ago by ahdee30

Has there been any progress with this? I'm still trying to move huge histories [~80 GB]  between Galaxy Cloudman clusters and nothing really happens at the Import end. Downloading the history as a compressed file would take 24 hrs, meanwhile the cluster has to be running, costing $$. Will transferring be possible soon?



written 3.2 years ago by madkisson30

Hey Michael,

Export/import functionality has seen several improvements since this post was made, but if you're still having trouble then there's probably more room for improvement.  I thought we'd sorted out the errors you were seeing with this when we chatted out-of-band, though, and that the archive simply wasn't done generating yet -- did something else go wrong?


written 3.2 years ago by Dannon Baker3.7k

Hey Dannon

Yeah sorry about that - I got busy again and didn't follow up. I did what you recommended and kept the export link and continually checked it. It took some time [82 GB history] but it eventually got to where it read 'export is complete' and hitting the link began a download process. Importing, however, never happened. I selected Import From File from the drop-down menu, pasted the link, and just left it overnight - nothing. Nothing ever happened. It just kept reading that it was importing and will be visible when finished.



modified 3.2 years ago • written 3.2 years ago by madkisson30