Question: Disk Space Required to Install a Galaxy Instance on a Cluster (for small lab)
tendai (United States) wrote, 3.6 years ago:

I have found my Galaxy account very useful, but every so often I have to download or delete data from it before I can run additional analyses. When I asked our Bioinformatics Core whether they could install a Galaxy instance for our lab on one of their “super” computers (>1200 processor cores and more than 5400 GB of memory in total), they asked me how much disk space we would need. Since I am not very computer literate and am still fairly new to Galaxy, I have decided to ask the Galaxy community. The IT guys will take care of the installation. Here are my questions:
1. Would a disk capacity of 500 GB be enough to analyze, say, 2 dog whole genome sequences (WGS) concurrently, at ~15X average coverage (Illumina), from QC through calling and comparing variants? What is the minimum disk capacity required for that kind of job? (We have >20 WGSs, but we would be happy to analyze 2-4 concurrently.)

2. I have never used Galaxy anywhere other than from my account at www.usegalaxy.org. Will a Galaxy instance installed on our “super” computer have the same easy-to-use interface, or will we need to use the command line?
I hope I have included all the information necessary to get the answer I am looking for. Thank you in advance.

Hotz, Hans-Rudolf (Switzerland) wrote, 3.6 years ago:

Hi Tendai

Let's start with the good news: a local Galaxy installation will have the same look and feel as 'usegalaxy.org'. Which tools appear in it, of course, depends on what has been installed, so you need to give your IT guys a list of the tools you require. An out-of-the-box Galaxy installation comes with a few pre-installed tools, but most likely the tools you need will have to be installed additionally via the Tool Shed ( https://wiki.galaxyproject.org/ToolShed ).

With regard to the required disk space, you probably know more than I do, as I am not very familiar with variant calling. Based on your experience on 'usegalaxy.org', you probably already know how much disk space is required per experiment. Assuming the raw data (i.e. FASTQ files) are already on the cluster, I recommend working with data libraries ( https://wiki.galaxyproject.org/Admin/DataLibraries/Libraries ) in order to avoid data duplication.
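
To put rough numbers on this, here is a back-of-the-envelope sketch in Python. It is only an illustration under stated assumptions, not a sizing rule: it assumes a dog genome of about 2.4 Gbases, uncompressed FASTQ at about 2 bytes per sequenced base (one for the base call, one for its quality score, ignoring read names), and a hypothetical `footprint_factor` that stands for how many times the raw FASTQ size a full history ends up occupying once trimmed reads, alignments and variant files are kept. The function names and the factor of 3 are made up for the example; the real factor is best read off an existing history on 'usegalaxy.org'.

```python
def fastq_size_gb(genome_size_gbases, coverage, bytes_per_base=2.0):
    """Approximate uncompressed FASTQ size (GB) for one sample.

    bytes_per_base=2.0 is an assumption: one byte per base call plus one
    per quality score, ignoring read names and line breaks.
    """
    return genome_size_gbases * coverage * bytes_per_base


def total_disk_gb(per_sample_fastq_gb, n_samples, footprint_factor):
    """Approximate total disk (GB) for n_samples analyzed concurrently.

    footprint_factor is a hypothetical multiplier: total history size
    divided by raw FASTQ size. Measure it from a real history.
    """
    return per_sample_fastq_gb * n_samples * footprint_factor


if __name__ == "__main__":
    # Assumed inputs: ~2.4 Gbase dog genome, 15X coverage, factor of 3.
    per_sample = fastq_size_gb(2.4, 15)
    for n_samples in (2, 4):
        total = total_disk_gb(per_sample, n_samples, footprint_factor=3.0)
        print(f"{n_samples} samples: about {total:.0f} GB")
```

With these placeholder values, two samples already come to roughly 430 GB, so 500 GB would leave little headroom and four samples would not fit. Compressed inputs, data libraries that link to existing files, and purging intermediate datasets would all lower the figure.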

As an additional point: you need to discuss with your IT guys who will be a Galaxy admin ( https://wiki.galaxyproject.org/Admin/Interface ).

 

I hope this will get you started.

Hans-Rudolf

Hi Hans-Rudolf, thank you very much for your response. This helps a lot! -tendai

From: Hotz, Hans-Rudolf on Galaxy Biostar [mailto:notifications@biostars.org]
Sent: Tuesday, April 14, 2015 2:48 AM
To: Mutangadura, Tendai
Subject: [galaxy-biostar] A: Disk Space Required to Install a Galaxy Instance on a Cluster (for small lab)
