Question: Setting up SLURM DRMAA for Galaxy
15 months ago, onspotproductions (40, United States) wrote:

I am building a computing cluster for use with Galaxy and have set up SLURM as the resource manager. I know the DRMAA plugin is necessary, but I'm unsure of the best way to compile it for use with Galaxy. I found this guide and am curious whether there are any settings I may need to change before compiling.

http://gmod.827538.n3.nabble.com/Running-Galaxy-on-a-cluster-with-SLURM-td4051302.html

rna-seq cluster galaxy slurm • 750 views
written 15 months ago by onspotproductions (40)

You need to install python-drmaa, which serves as the glue between the libdrmaa.so provided by SLURM and Python/Galaxy.

Edit: I forgot to mention that you need slurm-drmaa too, since that's what provides the libdrmaa.so library.
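Once slurm-drmaa is built, a quick way to confirm the shared library is usable before pointing Galaxy at it is to try loading it with ctypes. This is a sketch, not part of Galaxy itself; the `/usr/local/lib/libdrmaa.so` default path is an assumption, since your install prefix may differ.

```python
import ctypes
import os


def drmaa_lib_loads(path):
    """Return True if the shared library at `path` exists and can be loaded."""
    if not os.path.exists(path):
        return False
    try:
        # CDLL raises OSError if the file is not a loadable shared object.
        ctypes.CDLL(path)
        return True
    except OSError:
        return False


# A path that does not exist fails the check:
print(drmaa_lib_loads("/no/such/libdrmaa.so"))  # False
```

If this returns False for the path you built slurm-drmaa into, fix the library before touching Galaxy's configuration; the drmaa_library_path param in job_conf.xml should point at the same file.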

modified 15 months ago • written 15 months ago by Devon Ryan (1.8k)

Is there any configuration necessary for either tool, other than modifying the job conf file for Galaxy? (This is the person who posted the original question; I can't access my Google login account.)

written 14 months ago by djevo1 (60)

If you can run squeue and see the right queue from within the container then everything is working. Note that you need the proper munge.key as well, since that'll be key to the authentication (without this, squeue won't work).

modified 14 months ago • written 14 months ago by Devon Ryan (1.8k)

One last piece of information I need: do I need to install the drmaa Python library inside the Galaxy venv, or can I simply install it system-wide?

written 14 months ago by djevo1 (60)

I'm pretty sure I installed it inside the venv.

written 14 months ago by Devon Ryan (1.8k)

Got it. Also, were there any environment variables that needed to be set up in order to use python-drmaa with SLURM?

written 14 months ago by djevo1 (60)

No, the only other thing is setting the queue/partition to use in job_conf.xml (and any related settings).

written 14 months ago by Devon Ryan (1.8k)

Do you have an example slurm job_conf I could look at?

written 14 months ago by djevo1 (60)

Here's the relevant part:

<plugins workers="10">
    <plugin id="local" type="runner" load="galaxy.jobs.runners.local:LocalJobRunner"/>
    <plugin id="slurm" type="runner" load="galaxy.jobs.runners.slurm:SlurmJobRunner">
        <param id="drmaa_library_path">/usr/local/lib/libdrmaa.so</param>
    </plugin>
</plugins>
<destinations default="slurm">
    <destination id="local" runner="local">
        <param id="local_slots">4</param>
    </destination>
    <destination id="slurm" runner="slurm">
        <param id="embed_metadata_in_job">False</param>
        <param id="nativeSpecification">-p Galaxy</param>
        <env file="/galaxy-central/.venv/bin/activate" />
    </destination>
    <destination id="slurm4threads" runner="slurm">
        <param id="request_cpus">4</param>
        <param id="embed_metadata_in_job">False</param>
        <param id="nativeSpecification">-p Galaxy -n 4</param>
        <env file="/galaxy-central/.venv/bin/activate" />
    </destination>
    <destination id="slurm10threads" runner="slurm">
        <param id="request_cpus">10</param>
        <param id="embed_metadata_in_job">False</param>
        <param id="nativeSpecification">-p Galaxy -n 10</param>
        <env file="/galaxy-central/.venv/bin/activate" />
    </destination>
</destinations>
written 14 months ago by Devon Ryan (1.8k)

What is the "workers" attribute on the plugins tag at the top? Is it the number of nodes in the cluster?

written 14 months ago by djevo1 (60)

I modified the job_conf and am now getting this error:

    Traceback (most recent call last):
      File "<string>", line 1, in <module>
      File "/mnt/VDisk/galaxy_production/galaxy/lib/galaxy/dependencies/__init__.py", line 112, in optional
        conditional = ConditionalDependencies( config_file )
      File "/mnt/VDisk/galaxy_production/galaxy/lib/galaxy/dependencies/__init__.py", line 21, in __init__
        self.parse_configs()
      File "/mnt/VDisk/galaxy_production/galaxy/lib/galaxy/dependencies/__init__.py", line 30, in parse_configs
        for plugin in ElementTree.parse( job_conf_xml ).find( 'plugins' ).findall( 'plugin' ):
      File "/usr/lib64/python2.7/xml/etree/ElementTree.py", line 1182, in parse
        tree.parse(source, parser)
      File "/usr/lib64/python2.7/xml/etree/ElementTree.py", line 656, in parse
        parser.feed(data)
      File "/usr/lib64/python2.7/xml/etree/ElementTree.py", line 1642, in feed
        self._raiseerror(v)
      File "/usr/lib64/python2.7/xml/etree/ElementTree.py", line 1506, in _raiseerror
        raise err
    xml.etree.ElementTree.ParseError: mismatched tag: line 8, column 6

This is the job_conf config:

<?xml version="1.0"?>
<!-- A sample job config that explicitly configures job running the way it is configured by default (if there is no explicit config). -->
<job_conf>
    <plugins>
        <plugin id="local" type="runner" load="galaxy.jobs.runners.local:LocalJobRunner" workers="4"/>
    <plugin id="slurm" type="runner" load="galaxy.jobs.runners.slurm:SlurmJobRunner">
        <param id="drmaa_library_path">/usr/local/lib/libdrmaa.so</param>
    </plugins>
    <handlers>
        <handler id="main"/>
    </handlers>
    <destinations default="slurm">
        <destination id="local" runner="local">
            <param id="local_slots">4</param>
        </destination>
        <destination id="slurm" runner="slurm">
            <param id="embed_metadata_in_job">False</param>
            <param id="nativeSpecification">-p Galaxy</param>
            <env file="/mnt/VDisk/galaxy_production/galaxy/.venv/bin/activate" />
    </destination>
    <destination id="slurm7threads" runner="slurm">
            <param id="request_cpus">7</param>
            <param id="embed_metadata_in_job">False</param>
            <param id="nativeSpecification">-p Galaxy -n 7</param>
            <env file="/mnt/VDisk/galaxy_production/galaxy/.venv/bin/activate" />
    </destinations>
</job_conf>
modified 14 months ago • written 14 months ago by djevo1 (60)

You're missing a </plugin> line before the </plugins> line; that's what "mismatched tag" means (i.e., you have a start tag with no end tag). Your slurm7threads destination is likewise missing a closing </destination> before the </destinations> line.

Regarding the "workers", see here.

written 14 months ago by Devon Ryan (1.8k)
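Since the traceback above comes from Galaxy's own ElementTree.parse() call, you can catch this class of error before starting Galaxy by parsing job_conf.xml the same way. A minimal pre-flight check, as a sketch (the helper name and paths are illustrative, not part of Galaxy):

```python
import xml.etree.ElementTree as ElementTree


def validate_job_conf(path):
    """Return (True, None) if the XML parses cleanly, else (False, message)."""
    try:
        ElementTree.parse(path)
        return True, None
    except ElementTree.ParseError as err:
        # The message includes the line/column of the offending tag,
        # e.g. "mismatched tag: line 8, column 6"
        return False, str(err)
```

Running this against the config posted above would report the same "mismatched tag" location as the traceback, pointing you at the unclosed plugin element.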

Thank you, I figured that problem out. Last question: is there a way to have Galaxy use a specific SLURM partition other than Galaxy? It attempts to use that partition, but the existing cluster has a different partition name.

written 14 months ago by djevo1 (60)

Sure, just change the -p Galaxy part to something else. The configuration I posted is particular to my cluster; you'll have to adjust the -p option as appropriate for yours.

written 14 months ago by Devon Ryan (1.8k)
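If you have several destinations to update, editing each nativeSpecification by hand is error-prone. A small sketch that rewrites the -p argument in every destination of a job_conf.xml (the function name and partition names are hypothetical; adapt to your file):

```python
import xml.etree.ElementTree as ElementTree


def set_partition(job_conf_path, new_partition, out_path=None):
    """Replace `-p <name>` in each nativeSpecification param with the given partition."""
    tree = ElementTree.parse(job_conf_path)
    for param in tree.getroot().iter("param"):
        if param.get("id") == "nativeSpecification" and param.text:
            tokens = param.text.split()
            for i, tok in enumerate(tokens):
                # Rewrite the value that follows each -p flag.
                if tok == "-p" and i + 1 < len(tokens):
                    tokens[i + 1] = new_partition
            param.text = " ".join(tokens)
    tree.write(out_path or job_conf_path)
```

Note this only touches the -p flag, so other options such as -n thread counts in the same nativeSpecification are left alone.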
Powered by Biostar version 16.09