Question: Jobs (Tophat/cufflinks) waiting to load for several days (don't seem to be queued yet)
2.4 years ago by
eboyle2
United States
eboyle2 wrote:

Hello,

I'm running tophat/cufflinks/cuffdiff on Galaxy on what is probably a large data set (3 replicates of 8 different conditions). I loaded everything Wednesday morning and it is still not done. All of the Tophat runs and some of the Cufflinks runs have finished, but the ones that have not run yet are still waiting to load and are not even queued as far as I can tell (they have an exclamation mark next to them instead of a clock).

Do I need to free up more space for this to run? I am currently using 71% of my allotted space, and based on previous experiments I think that will be enough for the file sizes. I've been deleting large files from the server as I go (for instance, I deleted the initial files after I converted them because they were taking up too much space). I could delete the groomed files if that is needed for the jobs to run, but I would prefer not to if it isn't necessary.

Or is this because the addition of running some jobs on Jetstream is causing general delays? I don't need this analysis done in any particular rush, but if it is sitting there indefinitely and I need to do something to correct it, I would like to know.

Thank you.

cufflinks galaxy • 861 views
modified 2.4 years ago by Jennifer Hillman Jackson • written 2.4 years ago by eboyle2

The 'exclamation mark state' means that the job is probably waiting on its inputs. Are you sure that you did not delete some of them while cleaning up?

written 2.4 years ago by Martin Čech

Thank you. I double-checked and that doesn't seem to be the problem. However, I did notice that the Tophat runs that Cufflinks is waiting on show the message "An error occurred while setting the metadata for this set". Here is what it looks like: [screenshot]

I think it is confused about which genome alignment to use, but I can't correct it: it says that because this dataset is being used as input or output, metadata cannot be changed. Should I cancel the Cufflinks runs and see if I can update it? (I don't think this is necessarily the cause, since it was also a problem on several Tophat datasets that Cufflinks did run on and that I updated after the run was complete.)

written 2.4 years ago by eboyle2

I would try cancelling the cufflinks job and running auto-detect to set the metadata on the tophat dataset. If all inputs are ready the 'exclamation mark state' should not last longer than a few minutes.

written 2.4 years ago by Martin Čech

I cancelled the remaining Cufflinks jobs (deleted, then permanently deleted). I still was not able to change the metadata, even after waiting overnight and trying again the next morning. (It still says that the datasets are being used as input or output, so metadata cannot be changed.)

written 2.4 years ago by eboyle2
2.4 years ago by
United States
Jennifer Hillman Jackson wrote:

Hello,

I just examined the history. I can see some successful jobs (active/hidden/deleted) and others that were deleted at one of the stages before execution completed. However, I cannot see the earlier metadata problems on the deleted data, and none of the current successful/active datasets I looked at had this issue (if there is an example, please reply with a dataset number).

It appears that you have started over from Tophat, correct? Allow these jobs, both yellow and grey, to fully complete. They will not all execute at the same time, but in the general order they were launched. Only a few jobs run at any particular time. As jobs complete, new ones of the same type will start (jobs are dispatched to different clusters/queues depending on the allocated compute resource).
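As a rough illustration of why some jobs stay grey for a long time, here is a toy model of this kind of dispatch. It is only a sketch and not Galaxy's actual scheduler: the function name `schedule`, the job durations, and the limit of two concurrent jobs are all hypothetical values chosen for the example.

```python
import heapq


def schedule(durations, slots):
    """Toy model: jobs start in launch order, at most `slots` at a time.

    Returns a list of (start, end) times, one pair per job.
    All jobs are assumed to be submitted at time 0.
    """
    running = []  # min-heap holding the end times of running jobs
    times = []
    for duration in durations:
        if len(running) < slots:
            start = 0  # a slot is still free at submission time
        else:
            # wait until the earliest-finishing job frees its slot
            start = heapq.heappop(running)
        end = start + duration
        heapq.heappush(running, end)
        times.append((start, end))
    return times


# Hypothetical run times (in hours) for four jobs, two slots available.
print(schedule([5, 3, 4, 2], slots=2))
# [(0, 5), (0, 3), (3, 7), (5, 7)]
```

In this model the third and fourth jobs sit in the queue until an earlier job finishes, even though nothing is wrong with them.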

If you encounter metadata problems again, or if all of the jobs you have launched stay queued for longer than a day again (meaning no jobs are running in your account), please post back and we can examine the history in that state.

Very sorry that you had problems, Jen, Galaxy team

written 2.4 years ago by Jennifer Hillman Jackson

Yeah, I started over since it seemed to not be working. Thank you very much. I will let you know if any problems arise again.

written 2.4 years ago by eboyle2

Update - Our admin Nate suggested (just for now) that your specific grey queued jobs be restarted with the compute resource set to send the jobs to Jetstream. There are two clusters running high resource jobs right now. One accepts 2 jobs (Default) and the other 4 (Jetstream), with 6 total concurrently executed, per user/account.

Changing the resource yourself will distribute your jobs better right now. We are tuning cluster dispatch so that the default option will more evenly distribute jobs to Jetstream for certain corner cases, like the one you ran into (tuning is in-progress as part of the Jetstream cluster integration).

This is how to modify the tool form: http://imgur.com/a/LvLXL

To make this easier to manage in the UI, I would suggest going through the active grey queued jobs from the bottom up. Use the dataset re-run button for a target job, change the cluster on the tool form, and submit. Then delete the original job (all of its datasets) and do the same for the next target job. Four of these should start up very soon. More can be queued now, or consider starting roughly two-thirds of your jobs now with the modified cluster option.
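The reasoning behind sending roughly two thirds of the jobs to Jetstream follows from the slot counts above (2 on Default, 4 on Jetstream, so 4 of 6). A minimal sketch of that arithmetic is below. The helper `split_jobs` is hypothetical, and the count of 24 jobs is only an illustrative assumption (one job per sample for 3 replicates of 8 conditions), not the actual number queued in the history.

```python
def split_jobs(n_jobs, slots):
    """Split n_jobs across clusters in proportion to their concurrent-job slots."""
    total = sum(slots.values())
    # whole-number share for each cluster
    shares = {name: (n_jobs * s) // total for name, s in slots.items()}
    # hand any leftover jobs to the clusters with the most slots first
    leftover = n_jobs - sum(shares.values())
    for name in sorted(slots, key=slots.get, reverse=True)[:leftover]:
        shares[name] += 1
    return shares


# Slot counts from the post above; 24 jobs is a hypothetical total.
slots = {"Default": 2, "Jetstream": 4}
print(split_jobs(24, slots))
# {'Default': 8, 'Jetstream': 16}
```

With these numbers, 16 of 24 jobs (two thirds) would be re-run with the Jetstream option and the rest left on the default.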

Hope this helps! Jen

modified 2.4 years ago • written 2.4 years ago by Jennifer Hillman Jackson

Thank you. When I reran everything the second time, it worked. I did wait to set up the cufflinks until after all the Tophat runs had been completed as a precaution (and I did have to reset the metadata for all of them) and everything worked out.

written 2.4 years ago by eboyle2

Thanks for the update - glad it worked out. I'll follow up internally to find out why the metadata is a problem right now. Best, Jen

written 2.4 years ago by Jennifer Hillman Jackson