Question: GALAXY_SLOTS and slashes
apozhitkov wrote, 17 months ago:

Here is my experience of getting Galaxy to use the multi-core / multi-processor environment of our server. Our setup is a 16-CPU Azure machine running the latest release of Galaxy (17.05). My goal was to enable multithreading for mapping tools such as RNA-STAR and TopHat (and any others that support it).

I thought that would be easy. Here is what rg_rnaStar.xml tells us:

…
STAR
        --runThreadN \${GALAXY_SLOTS:-4}
…

Where is GALAXY_SLOTS defined? galaxyproject.org explains: “In all cases you need to include a parameter on your job_conf.xml that specifies the number of processes to be used for a given destination. If correctly defined your [sic] should see your GALAXY_SLOTS variable contain the specified value.”
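
For anyone unsure what the `:-4` part does: it is ordinary shell default expansion, so the tool falls back to 4 threads whenever GALAXY_SLOTS is unset or empty. Here is a minimal Python sketch that asks `sh` to do the expansion under three environments. It assumes a POSIX `sh` is on the PATH and that the wrapper hands `${GALAXY_SLOTS:-4}` to the shell unchanged; the helper name `expand_slots` is made up for this illustration and is not part of Galaxy.

```python
import os
import subprocess


def expand_slots(env_value=None):
    """Return what the shell makes of ${GALAXY_SLOTS:-4} for a given environment.

    env_value=None means "variable not set at all" (hypothetical helper).
    """
    env = dict(os.environ)
    env.pop("GALAXY_SLOTS", None)  # start from a clean "unset" state
    if env_value is not None:
        env["GALAXY_SLOTS"] = env_value
    result = subprocess.run(
        ["sh", "-c", 'echo "${GALAXY_SLOTS:-4}"'],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


if __name__ == "__main__":
    print("unset ->", expand_slots())      # falls back to the default
    print("empty ->", expand_slots(""))    # ":-" also covers an empty value
    print("16    ->", expand_slots("16"))  # a set value wins over the default
```

So a tool that quietly runs on its fallback thread count is a hint that nothing upstream is setting the variable.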

  1. Looking for job_conf.xml. Not there. Found job_conf.xml.sample_basic and job_conf.xml.sample_advanced
  2. Somewhere it was written that job_conf.xml is not necessary. Without that file, Galaxy will spawn jobs assuming a uniprocessor machine.
  3. The job_conf.xml.sample_advanced looks complicated; it must be the right one!
  4. cp job_conf.xml.sample_advanced job_conf.xml
  5. Restart Galaxy.
  6. Trying to start RNA-STAR. Nothing’s happening.
  7. rm job_conf.xml, restart Galaxy.
  8. Starting RNA-STAR, working fine but clearly only 1 CPU is used.
  9. Let’s try cp job_conf.xml.sample_basic job_conf.xml, restart Galaxy
  10. Starting RNA-STAR, working fine but clearly only 1 CPU is used.
  11. More googling. Discovered that more folks are asking the same question: how to set up the GALAXY_SLOTS variable? Lots of comments on SLURM, DRM. Irrelevant for me.
  12. Someone named “Galactic engineer” made a post “Using GALAXY_SLOTS with multithreaded Galaxy tools”, which refers to “Running Galaxy Tools on a Cluster” on galaxyproject.org
  13. All right, it seems simple: add a line <param id="local_slots">16</param> to job_conf.xml
  14. vi job_conf.xml, line added, Galaxy restarted.
  15. RNA-STAR is still on one CPU. ARGGHHHHH!
  16. grep -r -i --include=*.sh 'GALAXY_SLOTS' ./
  17. It must be this one: …./lib/galaxy/jobs/runners/util/job_script/CLUSTER_SLOTS_STATEMENT.sh
  18. Edit, put 16 to explicitly set GALAXY_SLOTS, restart Galaxy
  19. RNA-STAR is still on one CPU. ARGGHHHHH!
  20. Revert …STATEMENT.sh to what it was.
  21. Found another “statement”: …./.venv/lib/python2.7/site-packages/pulsar/managers/util/job_script/CLUSTER_SLOTS_STATEMENT.sh
  22. Edit, restart, RNA-STAR still on one CPU. ARGGHHHHH!
  23. Revert …STATEMENT.sh to what it was. Lunch break. Frustration.
  24. Another look at the job_conf.xml:

    <destinations>
        <destination id="local" runner="local"/>
        <param id="local_slots">16</param>
    </destinations>

  25. What is that slash doing at the end after “local”? Looks like the tag is being closed.
  26. OK, how about removing the slash and adding a closing tag:

    <destinations>
        <destination id="local" runner="local">
            <param id="local_slots">16</param>
        </destination>
    </destinations>

  27. Restart.
  28. Run RNA-STAR. All processors are busy!
  29. TopHat – all processors are busy.
  30. Exhausted, but happy…
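
Both versions of the snippet are well-formed XML, so a parser will not flag the first one; the only difference is where the param ends up. Here is a small Python sketch (standard library only) that makes the difference visible. It checks just the two fragments from this post and makes no claim about how Galaxy itself reads job_conf.xml. The helper names `local_slots` and `stray_params` are invented for the illustration.

```python
import xml.etree.ElementTree as ET

# The fragment as first edited: the destination is self-closed,
# so the param is a sibling of it, not a child.
BROKEN = """
<destinations>
    <destination id="local" runner="local"/>
    <param id="local_slots">16</param>
</destinations>
"""

# The fragment after removing the slash and adding a closing tag.
FIXED = """
<destinations>
    <destination id="local" runner="local">
        <param id="local_slots">16</param>
    </destination>
</destinations>
"""


def local_slots(xml_text, destination_id="local"):
    """Return the local_slots value nested inside the destination, or None."""
    root = ET.fromstring(xml_text)
    for dest in root.iter("destination"):
        if dest.get("id") == destination_id:
            param = dest.find("param[@id='local_slots']")
            if param is not None and param.text:
                return param.text.strip()
    return None


def stray_params(xml_text):
    """List ids of <param> elements sitting directly under <destinations>."""
    root = ET.fromstring(xml_text)
    return [p.get("id") for p in root.findall("param")]


if __name__ == "__main__":
    print(local_slots(BROKEN), stray_params(BROKEN))  # None ['local_slots']
    print(local_slots(FIXED), stray_params(FIXED))    # 16 []
```

In the first fragment the destination has no local_slots of its own, which matches the one-CPU behaviour described in the steps above.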

:-)


Great post. I've had similar issues with SLURM on Docker Galaxy, which are only partially resolved.

written 17 months ago by colindaven