Galaxy Cloud cluster ran out of storage

4.0 years ago by

United States

ravi.alla • 10 wrote:

Hi,

I used the Galaxy-provided share string to launch a CloudMan cluster for a workshop. This shared Galaxy instance comes with only 20GB of storage for the galaxy volume. We ran out of storage halfway through the workshop, which brought it to a grinding halt (embarrassing, I know). I don't understand how we used up 20GB so fast with only ~10 users on two different instances. It must have been the data libraries that the instance came with. When I tried to increase the storage using the CloudMan interface, it said it was resizing the storage volume, but the resize never happened. I tried this multiple times without success. This caused all jobs to fail. Has anyone else seen the same issue? How do I avoid this in the future? It is nice to have the share string for workshops, but if it comes with only a 20GB galaxy volume and no way to increase it, I might have to look at other options (like making a snapshot to a larger volume?).

Thank you

Ravi

galaxy cloudman • 935 views

modified 4.0 years ago by Dannon Baker ♦ 3.7k • written 4.0 years ago by ravi.alla • 10

4.0 years ago by

Jennifer Hillman Jackson ♦ 25k

United States

Jennifer Hillman Jackson ♦ 25k wrote:

Hello Ravi,

You were using the Teach share instance described here?
http://wiki.galaxyproject.org/Teach/WorkshopAMI

The shared instance is kept small to conserve resources while it is in a shared state. Once it is in use for a workshop, you will want to start up a larger instance (at least 2x, but consider 8x if doing assembly) from the shared AMI, and then increase disk resources through the Admin interface. 500 GB is our standard for 10 users per instance, with one node added per user (10 nodes).

I have only scaled up disk resources when there was no other activity on the instance, but I am not certain whether that is a requirement. Our lead cloud developer, Dannon, will be able to give the definitive answer.

Very sorry that you experienced a problem. Our team will reply with more soon.

Jen, Galaxy team

4.0 years ago by

Dannon Baker ♦ 3.7k

United States

Dannon Baker ♦ 3.7k wrote:

Do you have the logs available from when you tried to resize? We might be able to figure out more about what exactly happened and why it failed if you do.

As Jen mentioned, the base volume is intentionally only 20GB, for several reasons, and it comes mostly full. For workshops we generally resize the base volume immediately, before doing anything else. The disk resizing code terminates Galaxy (temporarily) and unmounts the volume that jobs are working on (removing it from the share), so it will definitely kill all currently running jobs, after which it restarts everything. It definitely isn't a procedure that can be run seamlessly while work is going on.
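One way to confirm that a resize actually took effect before the workshop starts is to check the free space on the data volume. The following is only a minimal sketch using Python's standard library. The mount point `/mnt/galaxy` and the 100 GiB threshold are assumptions for illustration, not values taken from CloudMan, and `needs_resize` and `check_volume` are hypothetical helper names.

```python
import shutil

GIB = 1024 ** 3


def needs_resize(free_bytes, min_free_gib=100):
    """Return True when free space is below the (assumed) workshop minimum."""
    return free_bytes < min_free_gib * GIB


def check_volume(path, min_free_gib=100):
    """Summarize total and free space for the filesystem that holds `path`."""
    usage = shutil.disk_usage(path)
    return {
        "total_gib": round(usage.total / GIB, 1),
        "free_gib": round(usage.free / GIB, 1),
        "needs_resize": needs_resize(usage.free, min_free_gib),
    }


if __name__ == "__main__":
    # "/mnt/galaxy" is an assumed mount point; substitute the real one.
    try:
        print(check_volume("/mnt/galaxy"))
    except OSError as exc:
        print("could not inspect volume:", exc)
```

Running this right after the resize, and again once the data libraries are loaded, shows how much headroom is left before any users log in.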

Anyway, my best guess is that something was blocking the unmounting of your data volume so the resize couldn't actually happen, but I'd need to see the logs to figure out what happened.
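If an unmount is being blocked, it can help to see which processes still hold files open on the volume. The sketch below is not CloudMan's own code. It is a hypothetical helper, `find_blockers`, that scans the Linux `/proc` filesystem for processes whose working directory or open files sit under a given mount point. The path `/mnt/galaxy` is an assumption, and it needs to be run as root to see other users' processes.

```python
import os


def find_blockers(mount_point, proc_root="/proc"):
    """Map pid -> paths that the process holds under mount_point (Linux only)."""
    mount_point = os.path.normpath(mount_point)
    prefix = mount_point + os.sep
    blockers = {}
    try:
        entries = os.listdir(proc_root)
    except OSError:
        return blockers  # no /proc here (not Linux) or not readable
    for entry in entries:
        if not entry.isdigit():
            continue  # only numeric entries are process directories
        pid_dir = os.path.join(proc_root, entry)
        links = [os.path.join(pid_dir, "cwd")]
        fd_dir = os.path.join(pid_dir, "fd")
        try:
            links += [os.path.join(fd_dir, fd) for fd in os.listdir(fd_dir)]
        except OSError:
            pass  # process exited or permission denied
        for link in links:
            try:
                target = os.readlink(link)
            except OSError:
                continue
            if target == mount_point or target.startswith(prefix):
                blockers.setdefault(int(entry), []).append(target)
    return blockers


if __name__ == "__main__":
    # "/mnt/galaxy" is an assumed mount point; substitute the real one.
    for pid, paths in sorted(find_blockers("/mnt/galaxy").items()):
        print(pid, paths[0])
```

Any PID listed here is a candidate for what is keeping the volume busy. The standard `lsof` and `fuser` tools report similar information.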
