Question: Jobs on local Galaxy instance stuck in "New" state - Data Library datasets to blame?
Lance Parsons wrote:

After updating our Galaxy instance to run as two processes (one for the web interface, the other as a job handler), I've noticed that a number of jobs get stuck in the "new" state.

In a number of cases, I've resolved the issue by downloading one of the input files, uploading it again, and rerunning the job with the newly uploaded copy. In at least one of these cases, the offending input file had been copied from a Data Library.

Can anyone point me to something to look for in the database (or elsewhere) that would cause a job to consider one of its input datasets not ready? I'd very much like to fix these datasets, since re-uploading the data libraries would be very tedious.

I'm running latest_2014.08.11.

Tags: admin, software error, galaxy

Nate Coraor wrote:

Hi Lance,

The "job readiness check" looks at the state of a job's input datasets to make sure that they are all "ok". If a job is not being picked up to run, the most likely cause is that one of its inputs is not ready. You can work back from a job to its inputs using the `job_to_input_dataset` table, which connects rows in `job` to rows in `history_dataset_association`. Each of those in turn connects to a row in `dataset`, where you can see the dataset's state.
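
For example, to inspect a particular stuck job's inputs directly in the database, something like this should work (a sketch assuming PostgreSQL and the standard Galaxy schema; the hda `state` column is what the model maps as `_state`, and 1234 is a placeholder job id):

select j.id as job_id, j.state as job_state,
       hda.id as hda_id, hda.deleted as hda_deleted, hda.state as hda_metadata_state,
       d.id as dataset_id, d.state as dataset_state, d.deleted as dataset_deleted
from job j
join job_to_input_dataset jtid on jtid.job_id = j.id
join history_dataset_association hda on hda.id = jtid.dataset_id
join dataset d on d.id = hda.dataset_id
where j.id = 1234;  -- placeholder: the id of a stuck job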

The entire query for job readiness looks like this in its SQLAlchemy form:

# Excerpted from Galaxy's job handler; and_/or_ are sqlalchemy.and_ and
# sqlalchemy.or_, and model is galaxy.model.

# Jobs in the "new" state whose history dataset inputs are not ready
# (failed metadata, deleted, or in any state other than "ok").
hda_not_ready = self.sa_session.query(model.Job.id).enable_eagerloads(False) \
    .join(model.JobToInputDatasetAssociation) \
    .join(model.HistoryDatasetAssociation) \
    .join(model.Dataset) \
    .filter(and_((model.Job.state == model.Job.states.NEW),
                 or_((model.HistoryDatasetAssociation._state == model.HistoryDatasetAssociation.states.FAILED_METADATA),
                     (model.HistoryDatasetAssociation.deleted == True),
                     (model.Dataset.state != model.Dataset.states.OK),
                     (model.Dataset.deleted == True)))).subquery()

# Likewise for library dataset inputs.
ldda_not_ready = self.sa_session.query(model.Job.id).enable_eagerloads(False) \
    .join(model.JobToInputLibraryDatasetAssociation) \
    .join(model.LibraryDatasetDatasetAssociation) \
    .join(model.Dataset) \
    .filter(and_((model.Job.state == model.Job.states.NEW),
                 or_((model.LibraryDatasetDatasetAssociation._state != None),
                     (model.LibraryDatasetDatasetAssociation.deleted == True),
                     (model.Dataset.state != model.Dataset.states.OK),
                     (model.Dataset.deleted == True)))).subquery()

# New jobs assigned to this handler whose ids appear in neither
# "not ready" subquery are eligible to be dispatched.
jobs_to_check = self.sa_session.query(model.Job).enable_eagerloads(False) \
    .filter(and_((model.Job.state == model.Job.states.NEW),
                 (model.Job.handler == self.app.config.server_name),
                 ~model.Job.table.c.id.in_(hda_not_ready),
                 ~model.Job.table.c.id.in_(ldda_not_ready))) \
    .order_by(model.Job.id).all()
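
If it's easier to check directly against the database, a rough SQL equivalent of the `hda_not_ready` subquery would be something like this (a sketch assuming PostgreSQL; the string literals are assumed to match the model's state constants, e.g. 'failed_metadata' and 'ok'):

select distinct j.id
from job j
join job_to_input_dataset jtid on jtid.job_id = j.id
join history_dataset_association hda on hda.id = jtid.dataset_id
join dataset d on d.id = hda.dataset_id
where j.state = 'new'
  and (hda.state = 'failed_metadata'
       or hda.deleted = 't'
       or d.state != 'ok'
       or d.deleted = 't');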

Lance Parsons wrote:

Thanks Nate, that was very helpful.  

I suspect there are multiple reasons for stuck jobs, but one particularly troublesome group turned out to be caused by datasets that were marked as deleted in the `dataset` table but not in the `history_dataset_association` table, and so were still being used as job inputs.

The following query fixed the stuck jobs:

update dataset
set deleted = 'f',
    purgable = 'f'
where id in (
    select distinct d.id
    from dataset d
    join history_dataset_association hda on d.id = hda.dataset_id
    join job_to_input_dataset jtid on hda.id = jtid.dataset_id
    join job j on jtid.job_id = j.id
    where d.deleted = 't' and hda.deleted = 'f' and j.state = 'new'
);
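
Since this resurrects rows in the dataset table, it's probably worth previewing the affected rows and wrapping the change in a transaction. A minimal sketch (PostgreSQL):

begin;
-- preview: how many datasets the update above would touch
select count(distinct d.id)
from dataset d
join history_dataset_association hda on d.id = hda.dataset_id
join job_to_input_dataset jtid on hda.id = jtid.dataset_id
join job j on jtid.job_id = j.id
where d.deleted = 't' and hda.deleted = 'f' and j.state = 'new';
-- run the update, compare its row count to the preview, then:
commit;  -- or rollback if the counts don't match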

However, there seem to be many instances where a dataset is marked as deleted but its history dataset association is not. I'm wary of updating them all without knowing how they got into this state (or whether it is sometimes an appropriate state). Is this ever a valid state for a dataset?
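
To gauge the scope before changing anything more broadly, a query along these lines can count the inconsistent pairs, split by whether the underlying dataset has also been purged (purged datasets have had their files removed from disk and should not be resurrected). This is a sketch against the standard schema:

select d.purged, count(distinct d.id) as affected_datasets
from dataset d
join history_dataset_association hda on d.id = hda.dataset_id
where d.deleted = 't' and hda.deleted = 'f'
group by d.purged;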
