Question: SOLVED: Configure a tool to request a Slurm node with a lot of RAM, or "How do I change a tool's <job> attributes?"
chadmatsalla wrote, 11 months ago:

Hi All,

This is a generic question regarding Slurm and DRMAA. I have a tool. Let's call it "data_manager_star_index_builder". It needs a lot of RAM for this specific dataset.

  1. Is there a way to request that a single job (not a job type, a single job) use a Slurm node with a lot of RAM?
  2. How would I confine a given tool, "data_manager_star_index_builder", to Slurm nodes with a lot of RAM?

Thanks!

Chad Matsalla

Tags: admin • job config • slurm • drmaa

OK, I added a new destination in job_conf.xml:

    <destination id="slurm_cluster_bigmem" runner="slurm">
        <env file="/storage/app/galaxy/galaxy-csm9-server/.venv/bin/activate"/>
        <param id="enabled" from_environ="GALAXY_RUNNERS_ENABLE_SLURM">true</param>
        <param id="nativeSpecification" from_environ="NATIVE_SPEC">--ntasks=1 --share --mem=500000</param>
        <!-- <param id="mem" from_environ="SLURM_MEM">--mem=500000</param> -->
    </destination>
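
For reference, a destination with runner="slurm" needs a matching Slurm runner plugin loaded in the <plugins> section of job_conf.xml. A minimal sketch of that section (the worker count here is illustrative; the load path is Galaxy's stock Slurm runner):

    <plugins workers="4">
        <!-- Galaxy's stock Slurm job runner (talks to Slurm via DRMAA) -->
        <plugin id="slurm" type="runner" load="galaxy.jobs.runners.slurm:SlurmJobRunner"/>
    </plugins>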

Then I changed the tool's XML, adding a destination attribute to its <tool> tag:

<tool id="rna_star_index_builder_data_manager" name="rnastar index2" tool_type="manage_data" version="0.0.4" profile="17.01" destination="slurm_cluster_bigmem">

I restarted the Galaxy web process. The change didn't take effect. I reset the metadata for data_manager_star_index_builder. Jobs were still going to nodes with little memory.

Log says:

Persisting job destination (destination id: slurm_cluster)

Did I make the changes in the right place?

written 11 months ago by chadmatsalla
Answer: chadmatsalla wrote, 11 months ago:

I read the Galaxy Tool XML Schema, and the <tool> tag doesn't seem to support a destination attribute. Hmm.

I read the Galaxy Job Configuration page and found that the <tools> collection in job_conf.xml seems to do what I need. Sadly, there was no example anywhere. I put this in there:

    <tools>
        <tool id="toolshed.g2.bx.psu.edu/repos/iuc/data_manager_star_index_builder/rna_star_index_builder_data_manager/0.0.4" destination="slurm_cluster_bigmem" />
    </tools>
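
Putting the pieces together: <tools> sits at the top level of job_conf.xml as a sibling of <destinations>, and its destination attribute must match the id of a destination defined there. A minimal sketch of the whole file under those assumptions (the default destination and its memory request here are illustrative):

    <?xml version="1.0"?>
    <job_conf>
        <plugins workers="4">
            <plugin id="slurm" type="runner" load="galaxy.jobs.runners.slurm:SlurmJobRunner"/>
        </plugins>
        <destinations default="slurm_cluster">
            <!-- Ordinary destination most tools use (illustrative values) -->
            <destination id="slurm_cluster" runner="slurm">
                <param id="nativeSpecification">--ntasks=1 --mem=8000</param>
            </destination>
            <!-- Big-memory destination from the comment above -->
            <destination id="slurm_cluster_bigmem" runner="slurm">
                <param id="nativeSpecification">--ntasks=1 --share --mem=500000</param>
            </destination>
        </destinations>
        <tools>
            <!-- Route only this tool (referenced by its full Tool Shed ID) to the big-memory destination -->
            <tool id="toolshed.g2.bx.psu.edu/repos/iuc/data_manager_star_index_builder/rna_star_index_builder_data_manager/0.0.4" destination="slurm_cluster_bigmem"/>
        </tools>
    </job_conf>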

And I restarted Galaxy web.

Nope.

I remember one time troubleshooting a problem that I thought was related to galaxy-web, and restarting the handlers fixed it. That fits here: job_conf.xml is read by the job handler processes, so restarting only the web process doesn't pick up the change. So...

systemctl restart galaxy-handler@{0..3}.service

Now seen in the logs:

Persisting job destination (destination id: slurm_cluster_bigmem)

Bingo!!


Glad that worked out and thanks for posting back the solution.

The FAQ here covers this topic, for others who may be reading: https://galaxyproject.org/admin/config/performance/cluster/

Admin topics are covered in even more detail at: https://github.com/galaxyproject/dagobah-training

And the search at https://galaxyproject.org/ is a handy resource for finding docs, prior Q&A, and the like across Galaxy's resources. Searching for "slurm" will turn up information similar to what you posted (including some posts here on Galaxy Biostars).

written 11 months ago by Jennifer Hillman Jackson