SOLVED: Configure a tool to request a slurm node with a lot of RAM or "How do I change a tool's <job> attributes?"

11 months ago, chadmatsalla • 40 wrote:

Hi All,

This is a generic question regarding Slurm and DRMAA. I have a tool; let's call it "data_manager_star_index_builder". It needs a lot of RAM for this specific dataset.

Is there a way to request a Slurm node with a lot of RAM for a single job (not a job type, a single job)?
How would I confine a given tool, "data_manager_star_index_builder", to Slurm nodes with a lot of RAM?

Thanks!

Chad Matsalla

admin job config slurm drmaa • 517 views

modified 11 months ago • written 11 months ago by chadmatsalla • 40

OK, I added a new destination in job_conf.xml:

        <destination id="slurm_cluster_bigmem" runner="slurm">
            <env file="/storage/app/galaxy/galaxy-csm9-server/.venv/bin/activate"/>
            <param id="enabled" from_environ="GALAXY_RUNNERS_ENABLE_SLURM">true</param>
            <param id="nativeSpecification" from_environ="NATIVE_SPEC">--ntasks=1 --share --mem=500000</param>
            <!-- <param id="mem" from_environ="SLURM_MEM">--mem=500000</param> -->
        </destination>
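For anyone reading along, it may help to double-check how much memory that nativeSpecification actually asks for. The sketch below pulls the --mem value out of the string and converts it, assuming Slurm's default unit of megabytes when no suffix is given and assuming the K/M/G/T suffixes step by 1024. The function name requested_mem_mb is made up for this example and is not part of Galaxy or Slurm.

```python
import shlex

# Multipliers to megabytes. Assumption: Slurm's unit suffixes step by 1024.
_UNITS = {"K": 1 / 1024, "M": 1, "G": 1024, "T": 1024 * 1024}


def requested_mem_mb(native_spec):
    """Return the --mem request in megabytes, or None if --mem is absent.

    Hypothetical helper for illustration only.
    """
    for token in shlex.split(native_spec):
        if token.startswith("--mem="):
            value = token[len("--mem="):]
            suffix = value[-1:].upper()
            if suffix in _UNITS:
                return int(float(value[:-1]) * _UNITS[suffix])
            # No suffix: treated as megabytes (Slurm's default unit).
            return int(value)
    return None


# The specification string from the destination above.
spec = "--ntasks=1 --share --mem=500000"
mem_mb = requested_mem_mb(spec)
print(mem_mb, "MB, roughly", round(mem_mb / 1024), "GB")
```

Under those assumptions, the destination is asking for a bit under 500 GB per node, so only nodes that advertise at least that much memory can be scheduled for it.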

Then I added a destination attribute to the tool's <tool> tag:

<tool id="rna_star_index_builder_data_manager" name="rnastar index2" tool_type="manage_data" version="0.0.4" profile="17.01" destination="slurm_cluster_bigmem">

I restarted Galaxy web. The change didn't take effect. I reset the metadata for data_manager_star_index_builder. Jobs were still going to nodes with little memory.

Log says:

Persisting job destination (destination id: slurm_cluster)

Did I make the changes in the right place?

written 11 months ago by chadmatsalla • 40

11 months ago, chadmatsalla • 40 wrote:

I read the Galaxy Tool XML Schema and <tool> doesn't seem to support destination="". Hmm.

I read the Galaxy Job Configuration page and found that the <tools> collection in job_conf.xml seems to do what I need. Sadly, there was no example anywhere. I put this in there:

<tools>
    <tool id="toolshed.g2.bx.psu.edu/repos/iuc/data_manager_star_index_builder/rna_star_index_builder_data_manager/0.0.4" destination="slurm_cluster_bigmem" />
</tools>
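Since a typo in the tool id or destination id only shows up once a job runs, a small check before restarting can save a round trip. The sketch below parses a cut-down job_conf.xml and reports which destination a tool id would map to. It is a simplification, not Galaxy's own lookup: it matches tool ids exactly only, it leaves out the <plugins> section, and it assumes "slurm_cluster" is the default destination (that id comes from the log line earlier in the thread). The function name destination_for is invented for this example.

```python
import xml.etree.ElementTree as ET

# Cut-down job_conf.xml for illustration. Assumption: slurm_cluster is the
# default destination; <plugins> is omitted because it is not needed here.
JOB_CONF = """
<job_conf>
    <destinations default="slurm_cluster">
        <destination id="slurm_cluster" runner="slurm"/>
        <destination id="slurm_cluster_bigmem" runner="slurm">
            <param id="nativeSpecification">--ntasks=1 --share --mem=500000</param>
        </destination>
    </destinations>
    <tools>
        <tool id="toolshed.g2.bx.psu.edu/repos/iuc/data_manager_star_index_builder/rna_star_index_builder_data_manager/0.0.4" destination="slurm_cluster_bigmem"/>
    </tools>
</job_conf>
"""


def destination_for(job_conf_xml, tool_id):
    """Return the destination id a tool maps to (exact id match only).

    Hypothetical helper; Galaxy's real matching rules may be more lenient.
    """
    root = ET.fromstring(job_conf_xml)
    destinations = root.find("destinations")
    known = {d.get("id") for d in destinations.findall("destination")}
    for tool in root.findall("./tools/tool"):
        if tool.get("id") == tool_id:
            dest = tool.get("destination")
            if dest not in known:
                raise ValueError("tool maps to undefined destination: %s" % dest)
            return dest
    # No explicit mapping: fall back to the default destination.
    return destinations.get("default")


STAR_ID = (
    "toolshed.g2.bx.psu.edu/repos/iuc/data_manager_star_index_builder/"
    "rna_star_index_builder_data_manager/0.0.4"
)
print(destination_for(JOB_CONF, STAR_ID))
print(destination_for(JOB_CONF, "some_other_tool"))
```

To run it against a real config, read the file contents and pass them in place of JOB_CONF.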

And I restarted Galaxy web.

Nope.

I remember one time troubleshooting a problem that I thought was related to galaxy-web. Restarting the handlers fixed it. So...

systemctl restart galaxy-handler@{0..3}.service

Now seen in the logs:

Persisting job destination (destination id: slurm_cluster_bigmem)

Bingo!!
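The fix in this thread suggests the handler processes only pick up job_conf.xml changes when they are restarted, so the log line is the quickest way to confirm which destination jobs really land on. Here is a minimal sketch that counts destinations from "Persisting job destination" lines. The regular expression is based only on the two log lines quoted in this thread, and count_destinations is a made-up name; where your handler logs live depends on your setup.

```python
import re
from collections import Counter

# Pattern taken from the log lines quoted in this thread.
DEST_RE = re.compile(r"Persisting job destination \(destination id: ([^)]+)\)")


def count_destinations(lines):
    """Count how many jobs were persisted to each destination id."""
    counts = Counter()
    for line in lines:
        match = DEST_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


# Sample input: the two messages seen before and after the handler restart.
sample = [
    "Persisting job destination (destination id: slurm_cluster)",
    "Persisting job destination (destination id: slurm_cluster_bigmem)",
]
for dest, n in sorted(count_destinations(sample).items()):
    print(dest, n)
```

Passing an open log file instead of the sample list works the same way, since the function only iterates over lines.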

written 11 months ago by chadmatsalla • 40

Glad that worked out and thanks for posting back the solution.

The FAQ here covers this topic, for others that may be reading: https://galaxyproject.org/admin/config/performance/cluster/

Admin topics are covered in even more detail at: https://github.com/galaxyproject/dagobah-training

And the search at https://galaxyproject.org/ is a handy resource to find docs, prior Q&A, and the like across Galaxy's resources. The term "slurm" will find similar information to what you posted (including some posts here at Galaxy Biostars).

modified 11 months ago • written 11 months ago by Jennifer Hillman Jackson ♦ 25k
