Question: Galaxy PBS system
jasperkoehorst wrote, 3.8 years ago:

I am currently setting up a PBS system to work with Galaxy.

When I submit a job I see the following in the Galaxy debug output, and the job eventually ends in an error state within Galaxy. Since qstat reports that the job finished fine, I am not sure what is going wrong here.

Also, the log mentions /galaxy/galaxy-dist/database/job_working_directory/000/27, but when I look I only see /1 or /2, not /27...

Job output not returned by PBS: the output datasets were deleted while the job was running, the job was manually dequeued or there was a cluster error.

DEBUG 2015-02-16 07:29:46,527 (27) Working directory for job is: /galaxy/galaxy-dist/database/job_working_directory/000/27
DEBUG 2015-02-16 07:29:46,558 (27) Dispatching to pbs runner
DEBUG 2015-02-16 07:29:47,912 (27) Persisting job destination (destination id: batch)
INFO 2015-02-16 07:29:48,074 (27) Job dispatched
DEBUG 2015-02-16 07:29:48,901 (27) command is: python3.4 /galaxy/galaxy-dist/tools/vaap/SAPP/1_conversion/ '-input' '/galaxy/galaxy-dist/database/files/000/dataset_14.dat' -output '/galaxy/galaxy-dist/database/files/000/dataset_27.dat' -sourcedb "genbank" -format "genbank"; return_code=$?; cd /galaxy/galaxy-dist; /galaxy/galaxy-dist/ ./database/files /galaxy/galaxy-dist/database/job_working_directory/000/27 . /galaxy/galaxy-dist/config/galaxy.ini /galaxy/tmp/tmpLjCWqL /galaxy/galaxy-dist/database/job_working_directory/000/27/galaxy.json /galaxy/galaxy-dist/database/job_working_directory/000/27/metadata_in_HistoryDatasetAssociation_27_WAE9rW,/galaxy/galaxy-dist/database/job_working_directory/000/27/metadata_kwds_HistoryDatasetAssociation_27_JJRjQw,/galaxy/galaxy-dist/database/job_working_directory/000/27/metadata_out_HistoryDatasetAssociation_27_L1LeGv,/galaxy/galaxy-dist/database/job_working_directory/000/27/metadata_results_HistoryDatasetAssociation_27_sedwZX,,/galaxy/galaxy-dist/database/job_working_directory/000/27/metadata_override_HistoryDatasetAssociation_27_DaWNYB; sh -c "exit $return_code"
DEBUG 2015-02-16 07:29:48,938 (27) submitting file /galaxy/galaxy-dist/database/pbs/
DEBUG 2015-02-16 07:29:48,943 (27) queued in default queue as
DEBUG 2015-02-16 07:29:51,152 (27/ PBS job state changed from N to R
DEBUG 2015-02-16 07:30:38,774 (27/ PBS job state changed from R to C
DEBUG 2015-02-16 07:30:38,774 (27/ PBS job has completed successfully
WARNING 2015-02-16 07:30:38,775 Exit code was invalid. Using 0.
DEBUG 2015-02-16 07:30:38,900 setting dataset state to ERROR
DEBUG 2015-02-16 07:30:39,395 job 27 ended
galaxy.datatypes.metadata DEBUG 2015-02-16 07:30:39,395 Cleaning up external metadata files
galaxy.datatypes.metadata DEBUG 2015-02-16 07:30:39,437 Failed to cleanup MetadataTempFile temp files from /galaxy/galaxy-dist/database/job_working_directory/000/27/metadata_out_HistoryDatasetAssociation_27_L1LeGv: No JSON object could be decoded
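For context, "No JSON object could be decoded" is the message Python 2's json module raises when asked to parse an empty or truncated file, so this cleanup warning usually just means the metadata file was never written by the job. A minimal sketch of the failure mode (under Python 3 the wording differs, but the behavior is the same):

```python
import json

def try_decode(raw):
    """Return the decoded object, or None if the content is not valid JSON
    (e.g. an empty metadata file that the job never wrote)."""
    try:
        return json.loads(raw)
    except ValueError:  # json.JSONDecodeError subclasses ValueError
        return None

print(try_decode(''))              # empty file content -> None
print(try_decode('{"ok": true}'))  # valid metadata -> {'ok': True}
```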


Tags: grid, pbs
jmchilton (United States) answered, 3.8 years ago:

The reason you do not see the job_working_directory files is that Galaxy is deleting them - you can set cleanup_job = never in your galaxy.ini (or universe_wsgi.ini for older setups) to keep them around.
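As a galaxy.ini fragment, the setting described above would look like this (section name assumed from a stock galaxy-dist configuration):

```ini
[app:main]
# Keep job working directories after the job finishes, for debugging.
cleanup_job = never
```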

Beyond that - the version of the PBS library Galaxy currently leverages is known to fail with some newer variants of the DRM backend. Can you open the eggs.ini file in Galaxy's root directory and replace "pbs_python = 4.3.5" with "pbs_python = 4.4.0" and let me know if that fixes the problem?
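For reference, the suggested edit would look roughly like this in eggs.ini (the section name is an assumption based on a stock galaxy-dist checkout, where platform-dependent eggs live under [eggs:platform]):

```ini
[eggs:platform]
; Newer PBS/Torque backends need the updated pbs_python egg.
pbs_python = 4.4.0
```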


Indeed, I don't get the JSON error anymore, but jobs are still failing. pbs_python was already set to 4.4.0:

See ini below:

repository =
; these eggs must be scrambled for your local environment
no_auto = pbs_python

bx_python = 0.7.2
Cheetah = 2.2.2
MarkupSafe = 0.12
mercurial = 3.2.4
MySQL_python = 1.2.3c1
PyRods = 3.2.4
numpy = 1.6.0
pbs_python = 4.4.0
psycopg2 = 2.5.1
pycrypto = 2.5
pysam = 0.4.2
pysqlite = 2.5.6
python_lzo = 1.08_2.03_static
PyYAML = 3.10
guppy = 0.1.10
SQLAlchemy = 0.7.9
; msgpack_python = 0.2.4

— reply by jasperkoehorst

The next thing I would try is setting retry_job_output_collection to 4 instead of the default of 0 in galaxy.ini - this works around cases where Galaxy checks for job outputs too quickly and network file system caching becomes a problem.
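The suggestion above as a galaxy.ini fragment (section name assumed from a stock galaxy-dist configuration):

```ini
[app:main]
# Retry collecting job outputs up to 4 times before declaring failure,
# to ride out network file system attribute-caching delays.
retry_job_output_collection = 4
```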


— reply by jmchilton

I even set it to 40 and still no luck. However, when I allow PBS to run on the master node I get errors saying that Python libraries cannot be found. The Galaxy tools use python3.4, and the command shown in debug mode works perfectly from the command line; it also works when I modify job_conf.xml to use the local runner instead of pbs.


<?xml version="1.0"?>
<job_conf>
    <plugins>
        <!-- <plugin id="local" type="runner" load="" workers="4"/> -->
        <plugin id="pbs" type="runner" load=""/>
    </plugins>
    <handlers>
        <handler id="main"/>
    </handlers>
    <destinations default="batch">
        <!-- <destination id="local" runner="local"/> -->
        <destination id="batch" runner="pbs"/>
    </destinations>
</job_conf>

— reply by jasperkoehorst
Powered by Biostar version 16.09