Question: Setting up SLURM DRMAA for Galaxy
onspotproductions • 40 wrote:
I am building a computing cluster for use with Galaxy and have set up SLURM as the resource manager. I know the DRMAA add-on is necessary, but I am unsure of the best way to compile it for use with Galaxy. I found this guide and am curious whether there are any settings I need to change before compiling.
http://gmod.827538.n3.nabble.com/Running-Galaxy-on-a-cluster-with-SLURM-td4051302.html
You need to install python-drmaa, which serves as the glue between the libdrmaa.so provided by slurm and python/galaxy.
Edit: I forgot to mention that you need slurm-drmaa too, since that's what provides the libdrmaa.so library.
Is there any configuration necessary for either tool, other than modifying the job conf file for Galaxy? This is the person who posted the original question, as I can't access my Google login account.
If you can run squeue and see the right queue from within the container, then everything is working. Note that you need the proper munge.key as well, since that's key to the authentication (without it, squeue won't work).
One last piece of information I need: do I need to install the drmaa Python library inside the Galaxy venv, or can I simply install it system-wide?
I'm pretty sure I installed it inside the venv.
Got it. Also, were there any environment variables that needed to be set up in order to use python-drmaa with SLURM?
No, the only other thing is setting the queue/partition to use in job_conf.xml (and any related settings).
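As a side note, python-drmaa looks for libdrmaa.so through the DRMAA_LIBRARY_PATH environment variable when it is set, and otherwise through the system's standard library search, so no variable should be needed when slurm-drmaa is installed in a standard location. A minimal sketch for checking which case applies on a given machine, run with the same Python that Galaxy uses (the function name locate_libdrmaa is made up for illustration):

```python
import os
from ctypes.util import find_library


def locate_libdrmaa(environ=None):
    """Return the libdrmaa location python-drmaa would likely pick up, or None."""
    environ = os.environ if environ is None else environ
    # An explicit DRMAA_LIBRARY_PATH takes precedence when it is set.
    explicit = environ.get("DRMAA_LIBRARY_PATH")
    if explicit:
        return explicit
    # Otherwise fall back to the system's standard library lookup.
    return find_library("drmaa")


if __name__ == "__main__":
    found = locate_libdrmaa()
    if found:
        print("libdrmaa found:", found)
    else:
        print("libdrmaa not found; set DRMAA_LIBRARY_PATH to the full path of libdrmaa.so")
```

If this reports that nothing was found, exporting DRMAA_LIBRARY_PATH in the environment that starts Galaxy is the usual workaround.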
Do you have an example slurm job_conf I could look at?
Here's the relevant part:
What is the plugin workers identifier at the top? Is it the number of nodes in the cluster?
Modified the job_conf and am now getting this error
This is the job_conf config
You're missing a </plugin> line before the </plugins> line; that's what the "mismatched tag" error means (i.e., you have a start tag with no end tag).
Regarding the "workers", see here.
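This kind of mistake can be caught before restarting Galaxy by parsing the file with Python's standard XML parser. A rough sketch follows; the BROKEN fragment is a hypothetical reconstruction of the same mistake, not the job_conf from this thread, and check_well_formed is an illustrative helper name:

```python
import xml.etree.ElementTree as ET


def check_well_formed(xml_text):
    """Return None if xml_text parses, otherwise the parser's error message."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError as err:
        return str(err)
    return None


# Hypothetical fragment with the same mistake: <plugin> is never closed.
BROKEN = """<job_conf>
    <plugins>
        <plugin id="slurm" type="runner" load="galaxy.jobs.runners.slurm:SlurmJobRunner">
    </plugins>
</job_conf>"""

if __name__ == "__main__":
    # Prints a "mismatched tag" message with the line and column of the problem.
    print(check_well_formed(BROKEN))
```

For a real file, passing the contents of job_conf.xml to check_well_formed (or calling ET.parse on its path) gives the same check.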
Thank you, I figured that problem out. Last question: is there a way to have Galaxy use a specific SLURM partition other than Galaxy? It attempts to use that partition, but the existing cluster has a different partition name.
Sure, just change the -p Galaxy part to something else. The configuration that I posted is particular to my cluster, so you'll have to adjust the -p option as appropriate.
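To illustrate where that option lives, here is a small sketch that rewrites the -p value inside a nativeSpecification parameter. The EXAMPLE destination, its id, the extra --ntasks=1 option, and the partition name "compute" are all assumptions for illustration; the configuration actually posted in this thread is not reproduced here.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical destination block; adjust ids and options to your own job_conf.xml.
EXAMPLE = """<destinations default="slurm_default">
    <destination id="slurm_default" runner="slurm">
        <param id="nativeSpecification">-p Galaxy --ntasks=1</param>
    </destination>
</destinations>"""


def set_partition(xml_text, partition):
    """Replace the value of -p in every nativeSpecification param."""
    root = ET.fromstring(xml_text)
    for param in root.iter("param"):
        if param.get("id") == "nativeSpecification" and param.text:
            # Only match a standalone "-p <name>", not part of a longer option.
            param.text = re.sub(
                r"(?<!\S)-p\s+\S+",
                lambda match: "-p " + partition,
                param.text,
            )
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(set_partition(EXAMPLE, "compute"))
```

Editing the file by hand works just as well; the point is only that the partition is whatever follows -p in that parameter.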