Question: Uploaded files still being imported into the history after 48h
1
10 weeks ago by
f.laforets20
f.laforets20 wrote:

Hello,

I have uploaded several large (9-15 GB) files via FTP server. These were successfully uploaded. I then imported them into the history, after selecting the file type (.bam) and the genome (mm10). They quickly went from waiting jobs (grey), to running jobs (yellow). However, it has now been about 48h, and they are still on the yellow background, with the circling dots spinning. Only one of these jobs has been successfully imported, and is now on a green background.

I know these are large files, and in a rather large number (12). Uploaded large .bam files have taken several hours before to turn green, but 48h seems a bit much. I was therefore wondering if there were any issues, and if so if I should simply re-upload everything (which would take a lot of time!). Or maybe it's just a matter of the files being large and the server being too busy?

The upload was done to https://usegalaxy.org/

Thanks in advance for any help given!

Flo

galaxy bam • 308 views
ADD COMMENTlink modified 7 weeks ago by madisonivy1690 • written 10 weeks ago by f.laforets20

Hi Martin,

Thanks for your help. I will import it again, then!

Best wishes,

Flo

ADD REPLYlink written 9 weeks ago by f.laforets20
2
10 weeks ago by
Martin Čech ♦♦ 4.9k
United States
Martin Čech ♦♦ 4.9k wrote:

Hi Flo, we were upgrading usegalaxy.org to a new release 2 days ago and one of the handlers ended up in a bad shape. Sadly your job got randomly assigned to it and that is why it took so long and then failed. Please import it again and it should be much faster. Importing files after FTP upload should generally not take longer than an hour even for huge files.

ADD COMMENTlink written 10 weeks ago by Martin Čech ♦♦ 4.9k

Hi Martin, I seem to be having the same problem. Every time I upload a group of files, they remain in the yellow spinning state for a very long time (usually 24h before I delete them and start again), except for one file, which turns green. After checking, I realised that every time, the only file to turn green has a size 2 to 3 times smaller than its actual size. I have done this several times this week and always end up with the same result. Do you have any idea what I could do? Many thanks, Flo

ADD REPLYlink written 9 weeks ago by f.laforets20
0
8 weeks ago by
f.laforets20
f.laforets20 wrote:

Hi Martin,

I'm afraid I keep having the same problem. When I upload my .bam files via FTP and import them into the history, they remain in the yellow state for a very long time. Oddly, one of the files (not always the same one) turns green, but its size is always well below the actual size, which I suppose means something went wrong. Is it possible that there is still some issue with the new release? Or does the problem come from something I'm doing wrong?

Also, I noticed that whenever I import files or delete them (whether they're green or yellow), the percentage of allocated disk space used does not change. I have tried to refresh the history, log out, close my browser window and log in again, and nothing seems to change. I do not know whether the two problems are related but it seemed worth mentioning.

Any help would be greatly appreciated, Many thanks,

Florian

ADD COMMENTlink written 8 weeks ago by f.laforets20

My colleague and I tested BAM uploading extensively in an effort to reproduce your problems, including 20 GB and 100 GB files, and uploads seem to work correctly at the moment. Are you sure the FTP transfer finished and the files are the same? Are the sizes on the FTP server and on your disk the same or very similar?
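One way to run the size check described above from a script is sketched here in Python. It is only an illustration, not something used in this thread: the function names are hypothetical, the host, user, password and file name passed to `remote_size` are placeholders you would supply yourself, and a server that requires TLS would need `ftplib.FTP_TLS` instead of `FTP`. The local check relies on the 28-byte empty BGZF block that the SAM/BAM specification places at the end of every complete BAM file; a file cut short during transfer will normally lack it.

```python
import os
from ftplib import FTP

# Empty BGZF block that ends every complete BAM file (SAM/BAM specification).
BGZF_EOF = bytes.fromhex(
    "1f 8b 08 04 00 00 00 00 00 ff 06 00 42 43 02 00"
    " 1b 00 03 00 00 00 00 00 00 00 00 00"
)


def has_bgzf_eof(path):
    """Return True if the local file ends with the BGZF end-of-file marker."""
    if os.path.getsize(path) < len(BGZF_EOF):
        return False
    with open(path, "rb") as handle:
        handle.seek(-len(BGZF_EOF), os.SEEK_END)
        return handle.read() == BGZF_EOF


def remote_size(host, user, password, name):
    """Ask an FTP server for a file's size in bytes (all arguments are placeholders)."""
    with FTP(host) as ftp:
        ftp.login(user, password)
        ftp.voidcmd("TYPE I")  # binary mode, so SIZE reports raw bytes
        return ftp.size(name)


def sizes_match(local_path, remote_bytes):
    """Compare the local file size with a size reported by the server."""
    return os.path.getsize(local_path) == remote_bytes
```

If `has_bgzf_eof` returns False for a local file, the file was probably truncated before it was ever uploaded; if the local file is fine but `sizes_match` fails, the transfer itself is the more likely culprit.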

ADD REPLYlink written 7 weeks ago by Martin Čech ♦♦ 4.9k

Hi Martin,

Thanks for testing this. Yes, I usually wait until the whole batch has finished uploading. I also check the size of the uploaded file in the "get data" window, and they are OK. I select the file type and the genome for the batch and click start. And then it seems to stay in the yellow state forever. I have tried multiple times and always obtained the same results.

If it works for you, it likely means that I'm doing something wrong, I just can't figure out what...

I can try again, though, and let you know how it goes. I'll upload the files over the weekend and import them on Monday.

ADD REPLYlink written 7 weeks ago by f.laforets20

The current history with bam uploads in progress is a good test. Would you please allow it to complete?

You can delete the .bam.bai data now or allow it to complete and delete it after. There is already one successful bam upload finished, so some of your uploads are working.

Details here and via email to your bug report: https://biostar.usegalaxy.org/p/29551/#29601

We should consolidate this Q&A. Either of you can choose a thread for the followup and we'll link then close out the other.

Thanks! Jen

ADD REPLYlink written 7 weeks ago by Jennifer Hillman Jackson25k

Hi Jen,

Thanks for your replies on both my posts! I am sorry, but literally minutes before I received your replies, I deleted all potentially incomplete files and started the upload again.

I was not aware I had uploaded any .bam.bai files. I will double check the files I upload from now on, but so far the extensions in windows are always .bam.

If I run into the same problem, I will make sure I don't delete the files, so that you guys can work on the potential bug.

This only occurred to me now, but when I come in the morning after leaving my Galaxy session open for a long time, I have several times found an error message (error 504; I have the code and can send it to you if necessary). Every time I got this message, most of my files were still in the yellow state (after importing for almost 24h). I'm not sure it is relevant to the current problem, but maybe it will help.

Thanks again for your help,

Flo.

ADD REPLYlink written 7 weeks ago by f.laforets20

My mistake - I was looking at a different history. Yes, all of your uploads are bam datasets (no index .bam.bai).

The names of the datasets indicate that these are query-name sorted bams (this may or may not reflect the actual content; it is just the file name). Are you using autodetect for the datatype or directly assigning bam in the Upload tool?

There might be a conflict around this (and potentially a small bug, I'll run some tests), but for now, to get this data in, upload queryname sorted bams (or any bam) allowing the Upload tool to "autodetect" the datatype. You'll want the data coordinate-sorted when working in Galaxy for nearly all use cases anyway.

Most tools that consume query-name sorted bam now include an option (checked by default) to re-sort by queryname, if needed, during that tool's execution. No need to pre-sort.

The datatype bam means something special in Galaxy -- a coordinate sorted bam. Standardizing the format/datatype avoids many errors, extra sort steps before using downstream tools, and the like. Other sort/formats are covered by different datatypes: See "New BAM Datatypes" in these release notes. https://docs.galaxyproject.org/en/master/releases/18.01_announce.html. This is all somewhat new. If there is a problem, we'll want to fix it.

More feedback after we test. Thanks for reporting the problem and hopefully using the "autodetect" datatype assignment with "Upload" is a solution that can move you forward now.
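Since the file name alone does not prove how a BAM is sorted, the sort order declared in the header can be read before uploading. The Python sketch below is an illustration under stated assumptions, not part of Galaxy: `bam_sort_order` is a hypothetical helper, it relies on BGZF being readable as ordinary multi-member gzip, and it only reports what the `@HD` line claims (`SO:coordinate`, `SO:queryname`, and so on), which may differ from the real record order if the header is wrong.

```python
import gzip
import struct


def bam_sort_order(path):
    """Return the SO value from the @HD header line, or None if it is absent."""
    # BGZF is a series of gzip members, so the gzip module can decompress it.
    with gzip.open(path, "rb") as handle:
        if handle.read(4) != b"BAM\x01":
            raise ValueError("not a BAM file: bad magic bytes")
        # l_text: length of the plain-text SAM header, little-endian uint32.
        (l_text,) = struct.unpack("<I", handle.read(4))
        text = handle.read(l_text).rstrip(b"\x00").decode("utf-8", errors="replace")
    for line in text.split("\n"):
        if line.startswith("@HD"):
            for field in line.split("\t")[1:]:
                if field.startswith("SO:"):
                    return field[3:]
    return None
```

A result of `queryname` would suggest letting the Upload tool autodetect the datatype, as recommended above, rather than assigning bam directly.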

ADD REPLYlink written 7 weeks ago by Jennifer Hillman Jackson25k
0
7 weeks ago by
f.laforets20
f.laforets20 wrote:

Hi Jen,

Thanks a lot for the explanation. I'm importing the files now with the "autodetect" option. Hopefully this will work! I'll keep you posted.

Best wishes,

Flo

ADD COMMENTlink written 7 weeks ago by f.laforets20
1

It seems to have worked! The files are imported (green) and of the proper size. I'm going to run featureCounts on the datasets now. Thanks for your help!

Flo

ADD REPLYlink written 7 weeks ago by f.laforets20

Good news & glad that worked out!

ADD REPLYlink written 7 weeks ago by Jennifer Hillman Jackson25k