Question: Issue with ${GALAXY_SLOTS} and SGE cluster
19 months ago, david.roquis wrote:

Dear Galaxy Community,

I have been trying to configure my job_conf.xml for cluster use. We are running Son of Grid Engine (SGE) v8.1.9 with a default installation and configuration.

I have no problem submitting jobs from Galaxy to SGE. However, the ${GALAXY_SLOTS} variable used by various tools is never taken into account. The value assigned to this parameter is always 1 (no matter what is specified in the tool wrapper or in my job_conf.xml), and the number of slots reserved in SGE is also always 1. I have spent a lot of time trying to figure out where this problem comes from, without success.

You can take a look at my (for now) basic job_conf.xml below. I mostly followed the instructions I found here, and they seemed consistent with other posts and websites I found on this topic.

 <?xml version="1.0"?>
<!-- A sample job config that explicitly configures job running the way it is configured by default (if there is no explicit config). -->
        <plugin id="local" type="runner" load="" workers="4"/>  
        <plugin id="drmaa_default" type="runner" load="" workers="10"/>
        <!-- Override the $DRMAA_LIBRARY_PATH environment variable -->
        <param id="drmaa_library_path">/opt/sge/lib/</param>   
    <handlers default="handlers">
        <handler id="handler0" tags="handlers"/>
        <handler id="handler1" tags="handlers"/>    
        <handler id="handler2" tags="handlers"/>    
        <handler id="handler3" tags="handlers"/>    
    <destinations default="sge_default">
        <destination id="sge_default" runner="drmaa_default"/>
       <param id="nativeSpecification">-R y -V -j n -pe smp 4</param>
    <destination id="local" runner="local"/>
       <param id="local_slots">4</param> 

I have verified that I do have a parallel environment (-pe) called smp (with 999 slots). I don't know whether it is useful for debugging, but one characteristic of this parallel environment is that its allocation rule is $pe_slots. I also noticed that when I run qstat -j {jobnumber}, the parallel environment is not present in the job description for jobs launched from Galaxy.

I have the same issue when I use the local runner with local_slots set to 4 (after restarting Galaxy, of course): GALAXY_SLOTS is set to 1 instead of 4 or the default value from the tool XML wrapper.
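
For context, the sketch below shows how a tool wrapper typically consumes the slot count. The tool id, name, and the example_tool executable are hypothetical and only illustrate the pattern. The backslash keeps Galaxy's Cheetah templating from interpreting the variable, so the shell expands it at run time. As far as I know, Galaxy itself exports GALAXY_SLOTS as 1 when the destination does not configure a slot count, so the fallback after ":-" mainly matters when the command is run outside Galaxy.

```xml
<!-- Hypothetical tool wrapper; only the GALAXY_SLOTS pattern is the point here. -->
<tool id="example_threads" name="Example threaded tool" version="0.1.0">
    <command><![CDATA[
        ## \${GALAXY_SLOTS:-4} is expanded by the shell on the compute node:
        ## it uses the slot count exported by Galaxy, or 4 if the variable is unset.
        example_tool --threads "\${GALAXY_SLOTS:-4}" --input '$input' --output '$output'
    ]]></command>
    <inputs>
        <param name="input" type="data" format="txt" label="Input file"/>
    </inputs>
    <outputs>
        <data name="output" format="txt"/>
    </outputs>
</tool>
```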

One final remark. When I take the command line Galaxy generated for my job and launch it manually from a terminal with the same cluster submission parameters (qsub -N test_pe_smp -R y -V -j n -pe smp 4 my_galaxy_job.txt), it works perfectly, and the parallel environment is present in the job description when I call qstat -j {jobnumber}.

I am sure I am doing something wrong somewhere, but I haven't managed to figure it out yet. By the way, is there a way to see the command line that Galaxy sends to SGE?

Thank you very much in advance!


19 months ago, Devon Ryan wrote:
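<!-- Each <param> must be nested inside its <destination>; a self-closed <destination .../> cannot hold params. -->
<destinations default="sge_default">
    <destination id="sge_default" runner="drmaa_default">
        <param id="nativeSpecification">-R y -V -j n -pe smp 4</param>
        <param id="local_slots">4</param>
    </destination>
    <destination id="local" runner="local">
        <param id="local_slots">4</param>
    </destination>
</destinations>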

You might not need the local_slots param for the sge_default destination, but you do need to correct your XML regardless.
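
For reference, a complete job_conf.xml with every param nested inside the element it configures might look like the sketch below. It reuses the ids, worker counts, handlers, and native specification from the question. The load attribute values are an assumption on my part (they are empty in the posted config) and are the standard Galaxy runner classes as far as I know. The drmaa_library_path value is copied from the question as-is.

```xml
<?xml version="1.0"?>
<job_conf>
    <plugins>
        <!-- load values are assumed; the question shows them as empty strings -->
        <plugin id="local" type="runner" load="galaxy.jobs.runners.local:LocalJobRunner" workers="4"/>
        <plugin id="drmaa_default" type="runner" load="galaxy.jobs.runners.drmaa:DRMAAJobRunner" workers="10">
            <!-- Override the $DRMAA_LIBRARY_PATH environment variable -->
            <param id="drmaa_library_path">/opt/sge/lib/</param>
        </plugin>
    </plugins>
    <handlers default="handlers">
        <handler id="handler0" tags="handlers"/>
        <handler id="handler1" tags="handlers"/>
        <handler id="handler2" tags="handlers"/>
        <handler id="handler3" tags="handlers"/>
    </handlers>
    <destinations default="sge_default">
        <!-- The destination element is opened and closed around its params,
             not self-closed with the params left dangling after it. -->
        <destination id="sge_default" runner="drmaa_default">
            <param id="nativeSpecification">-R y -V -j n -pe smp 4</param>
        </destination>
        <destination id="local" runner="local">
            <param id="local_slots">4</param>
        </destination>
    </destinations>
</job_conf>
```

With the nesting fixed, the parallel environment should show up in qstat -j {jobnumber} for jobs submitted from Galaxy, which is a quick way to confirm that the native specification was actually applied.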


It works! Thanks a lot for your precise, quick, and efficient answer! I need to be more careful when working with XML in the future.

Reply written 19 months ago by david.roquis