Question: Strange Behavior With Job-Queue
Assaf Gordon (United States) wrote, 9.8 years ago:
Hello,

On a local Galaxy server, I've got into a strange situation: several jobs are marked as "new", but none are starting. I stopped and restarted the server, and got the following messages:

galaxy.jobs.runners.local DEBUG 2009-01-26 19:29:00,829 5 workers ready
galaxy.jobs.schedulingpolicy.roundrobin INFO 2009-01-26 19:29:00,829 RoundRobin policy: initialized
galaxy.jobs INFO 2009-01-26 19:29:00,829 job scheduler policy is galaxy.jobs.schedulingpolicy.roundrobin:UserRoundRobin
galaxy.jobs INFO 2009-01-26 19:29:00,829 job manager started
galaxy.jobs DEBUG 2009-01-26 19:29:00,952 no runner: 7886 is still in new state, adding to the jobs queue
galaxy.jobs DEBUG 2009-01-26 19:29:00,952 no runner: 7893 is still in new state, adding to the jobs queue
galaxy.jobs DEBUG 2009-01-26 19:29:00,952 no runner: 7896 is still in new state, adding to the jobs queue
galaxy.jobs DEBUG 2009-01-26 19:29:00,952 no runner: 7902 is still in new state, adding to the jobs queue
galaxy.jobs DEBUG 2009-01-26 19:29:00,952 no runner: 7904 is still in new state, adding to the jobs queue
galaxy.jobs DEBUG 2009-01-26 19:29:00,952 no runner: 7905 is still in new state, adding to the jobs queue
galaxy.jobs DEBUG 2009-01-26 19:29:00,952 no runner: 7906 is still in new state, adding to the jobs queue
galaxy.jobs DEBUG 2009-01-26 19:29:00,952 no runner: 7907 is still in new state, adding to the jobs queue
galaxy.jobs DEBUG 2009-01-26 19:29:00,952 no runner: 7908 is still in new state, adding to the jobs queue
galaxy.jobs INFO 2009-01-26 19:29:00,971 job stopper started

But even after a server restart, no jobs are starting (I waited for about a minute after the restart). Is there any configuration setting that can cause these jobs to start when I restart the server (or cause the 'stale' jobs to be deleted)?

Thanks,
Gordon.
galaxy • 1.1k views
modified 9.8 years ago by Nate Coraor • written 9.8 years ago by Assaf Gordon
Nate Coraor (United States) wrote, 9.8 years ago:
Gordon,

You may want to check the jobs' ancestors. Generally, jobs should only remain in the new state if the jobs they depend on have not yet completed.

--nate
written 9.8 years ago by Nate Coraor
Nate, in reply to your message of 01/27/2009 09:18 AM:

The same situation happened again on my Galaxy server. How do I check the jobs' ancestors? Other than the jobs marked 'new', there are no other jobs (running or waiting).

What I currently do is stop the server, manually reset the jobs with the following SQL command:

UPDATE job SET state='error' WHERE state='new';

and restart Galaxy.

There are some side effects to this operation: no jobs are running or waiting, yet the history-list pane still shows running/waiting jobs. But without this manual intervention, new jobs queued by the users are not started.

Thanks,
Gordon.
written 9.8 years ago by Assaf Gordon
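The manual reset described above can be made a little safer by listing the affected job ids and running the update inside a single transaction. The following is only a sketch: it runs against a toy in-memory SQLite table with the `job.state` column named in the thread, and the helper name `reset_new_jobs` is hypothetical, not part of Galaxy. A real Galaxy database will have many more columns and may use a different engine.

```python
import sqlite3

def reset_new_jobs(conn):
    """Mark every job in the 'new' state as 'error'; return the affected ids.

    Hypothetical helper for illustration, not a Galaxy API.
    """
    with conn:  # commits on success, rolls back if an exception is raised
        ids = [row[0] for row in conn.execute(
            "SELECT id FROM job WHERE state = 'new' ORDER BY id")]
        conn.execute("UPDATE job SET state = 'error' WHERE state = 'new'")
    return ids

# Toy stand-in for the Galaxy database (assumed, heavily simplified schema).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE job (id INTEGER PRIMARY KEY, state TEXT)")
conn.executemany("INSERT INTO job VALUES (?, ?)",
                 [(7886, "new"), (7893, "new"), (7900, "ok")])
print(reset_new_jobs(conn))  # ids of the jobs that were reset
```

Keeping the list of reset ids makes it possible to look afterwards at the datasets those jobs were supposed to produce, which is where the history-pane side effects come from.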
Checking a job's ancestors is non-trivial: you have to query that job's input datasets and then query the state of the job that created those datasets.

Can you try enabling the FIFO queue in universe_wsgi.ini?

Do the stale running/waiting entries go away if you refresh the history pane? They should. If not, there is something Very Wrong here.

Jobs in the 'new' state should not, in general, block the queue. This almost seems to indicate that something is killing the job queue thread. Are there any tracebacks in the log file?

--nate
written 9.8 years ago by Nate Coraor
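The two-step lookup described above (a job's input datasets, then the job that created each of them) can be written as one join. This is a sketch under stated assumptions: the link tables `job_to_input_dataset` and `job_to_output_dataset` and their columns are a simplified, assumed schema on a toy SQLite database, so check the names against your actual Galaxy database before using anything like it.

```python
import sqlite3

# For each job still in 'new', find the jobs that produced its inputs.
# Table and column names are an assumed, simplified schema.
ANCESTOR_QUERY = """
SELECT child.id, parent.id, parent.state
FROM job AS child
JOIN job_to_input_dataset  AS jin  ON jin.job_id = child.id
JOIN job_to_output_dataset AS jout ON jout.dataset_id = jin.dataset_id
JOIN job AS parent ON parent.id = jout.job_id
WHERE child.state = 'new'
ORDER BY child.id, parent.id
"""

def ancestors_of_new_jobs(conn):
    """Return (child_job_id, parent_job_id, parent_state) tuples."""
    return conn.execute(ANCESTOR_QUERY).fetchall()

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE job (id INTEGER PRIMARY KEY, state TEXT);
CREATE TABLE job_to_input_dataset  (job_id INTEGER, dataset_id INTEGER);
CREATE TABLE job_to_output_dataset (job_id INTEGER, dataset_id INTEGER);
INSERT INTO job VALUES (1, 'error'), (2, 'new'), (3, 'new');
INSERT INTO job_to_output_dataset VALUES (1, 10);  -- job 1 produced dataset 10
INSERT INTO job_to_input_dataset  VALUES (2, 10);  -- job 2 consumes dataset 10
""")

for child, parent, state in ancestors_of_new_jobs(conn):
    print(f"job {child} waits on job {parent} (state: {state})")
```

A 'new' job whose parent is in a terminal state such as 'error' will never become runnable on its own. A 'new' job with no rows at all (job 3 in the toy data) has no ancestors, so something else must be holding it back.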
Hello, in reply to Nate Coraor's message of 01/27/2009 01:22 PM:

Well, my Galaxy database is definitely a mess. It has endured many crashes and exceptions, some of them happening in the middle of long workflows, which left some jobs waiting on other non-existent jobs and datasets. So I'm guessing that most of what I'm seeing wouldn't really happen in a stable Galaxy installation. However, there are two issues I've found which affect the perceived stability.

First, the 'state' column in the DATASET table matters, regardless of the 'state' column in the JOB table. Stopping and restarting Galaxy while there are DATASET rows marked as 'running' somehow affects the jobs (or maybe I just imagined it?). It might also affect the number of datasets reported in the history pane. That is, if the DATASET table has datasets marked as 'new' or 'running', they will appear as new (grey) or running (yellow) even if there are no jobs running or waiting.

Second, the 'visible' and 'deleted' columns in the HISTORY_DATASET_ASSOCIATION table. I somehow got into a situation where I had a row in that table with INFO = "Unable to Finish Job", DELETED = FALSE, VISIBLE = FALSE. That dataset appeared as an error (red box) in the history list, but when the user switched to that history, he (obviously) didn't see any red boxes (I guess because of VISIBLE = FALSE). Very confusing indeed.

I've removed all the 'running' things (datasets and jobs) and restarted Galaxy. I hope things will calm down now.

Thanks for all your help.
Gordon.
written 9.8 years ago by Assaf Gordon
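Both inconsistencies described above can be looked for with read-only queries before anything is deleted. The sketch below uses a toy SQLite database. The `dataset.state` column and the `info`, `deleted` and `visible` columns of `history_dataset_association` come from the thread. The `job_to_output_dataset` link table, its direct reference to `dataset.id`, and the list of "active" job states are simplifying assumptions.

```python
import sqlite3

# Datasets shown as new/running although no active job will produce them.
# The set of "active" job states here is an assumption.
ORPHANED_DATASETS = """
SELECT d.id, d.state
FROM dataset AS d
WHERE d.state IN ('new', 'running')
  AND NOT EXISTS (
        SELECT 1
        FROM job_to_output_dataset AS jout
        JOIN job AS j ON j.id = jout.job_id
        WHERE jout.dataset_id = d.id
          AND j.state IN ('new', 'queued', 'running'))
ORDER BY d.id
"""

# History items that are neither deleted nor visible.
HIDDEN_UNDELETED = """
SELECT id, info
FROM history_dataset_association
WHERE deleted = 0 AND visible = 0
ORDER BY id
"""

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE job (id INTEGER PRIMARY KEY, state TEXT);
CREATE TABLE dataset (id INTEGER PRIMARY KEY, state TEXT);
CREATE TABLE job_to_output_dataset (job_id INTEGER, dataset_id INTEGER);
CREATE TABLE history_dataset_association (
    id INTEGER PRIMARY KEY, info TEXT, deleted INTEGER, visible INTEGER);
INSERT INTO job VALUES (1, 'error'), (2, 'running');
INSERT INTO dataset VALUES (10, 'running'), (11, 'running'), (12, 'ok');
INSERT INTO job_to_output_dataset VALUES (1, 10), (2, 11);
INSERT INTO history_dataset_association VALUES
    (100, 'Unable to Finish Job', 0, 0),
    (101, '', 0, 1);
""")

orphaned = conn.execute(ORPHANED_DATASETS).fetchall()
hidden = conn.execute(HIDDEN_UNDELETED).fetchall()
print("orphaned datasets:", orphaned)
print("hidden but not deleted:", hidden)
```

In the toy data, dataset 10 is still 'running' although its job ended in 'error', and history item 100 is the hidden-but-not-deleted row of the kind described in the post.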