I have a brand new install of Slurm with two nodes, and I just managed to get Slurm working.
The Galaxy web interface reports no jobs.
Galaxy's main log is filling with:
galaxy.jobs.runners.slurm WARNING 2017-12-06 15:50:59,820 (91/40) Job was reported by drmaa as terminal but job state in SLURM is: PENDING, returning to monitor queue
galaxy.jobs.runners.drmaa DEBUG 2017-12-06 15:51:00,830 (91/40) state change: job finished, but failed
I can't seem to find out what to do about this.
root@slurm-controller:~# squeue
JOBID PARTITION NAME USER ST TIME NODES NODELIST(REASON)
root@slurm-controller:~# sacct -s PD,R
       JobID    JobName  Partition    Account  AllocCPUS      State ExitCode
------------ ---------- ---------- ---------- ---------- ---------- --------
40           g91_uploa+    bioinfo                     1    PENDING      0:0
Is this a Slurm problem or a Galaxy problem?
Thanks!
Chad Matsalla