Question: Data Storage
7.5 years ago, Fanny Coffin wrote:
Hi, I'm evaluating whether we can use Galaxy in our production environment for NGS data, and I have a question about data storage. NGS produces huge files, which we store on our servers in a specific folder structure. To be used in Galaxy, these files have to be uploaded (so that the database can be filled with information such as the first lines, the fields...). But do the files necessarily have to be imported into the Galaxy workspace, or can they simply be linked? I ask because we absolutely want to avoid data duplication. Could you please enlighten me? Thanks in advance. Cordially, Fanny COFFIN
modified 7.5 years ago by Greg Von Kuster • written 7.5 years ago by Fanny Coffin
7.5 years ago, Greg Von Kuster (Penn State University) wrote:
Hello Fanny, You should upload your files to a Galaxy Data Library: the upload form for data libraries lets you upload directories of files or files from filesystem paths. Either option lets you avoid making copies of your files. See our wiki at central/wiki/DataLibraries/UploadingFiles for details on the options for uploading files to a data library, and central/wiki/DataLibraries/Libraries for details about data libraries themselves. Greg Von Kuster Galaxy Development Team
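To make the difference between linking and copying concrete, here is a minimal Python sketch. It is not Galaxy code: the function names `register_file` and `is_separate_copy` are hypothetical, and it only illustrates the general idea that a linked entry points at the original file on disk while a copied entry consumes storage a second time.

```python
import os
import shutil
import tempfile


def register_file(src, library_dir, link=True):
    """Place src in library_dir as a symlink (no copy) or as a full copy.

    Hypothetical helper for illustration only; not part of Galaxy.
    """
    dest = os.path.join(library_dir, os.path.basename(src))
    if link:
        # The library entry merely points at the original file.
        os.symlink(os.path.abspath(src), dest)
    else:
        # The file contents are written to disk a second time.
        shutil.copy2(src, dest)
    return dest


def is_separate_copy(src, dest):
    """Return True if dest is a distinct on-disk file rather than a link to src."""
    # samefile() follows symlinks and compares device and inode numbers.
    return not os.path.samefile(src, dest)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "reads.fastq")
        with open(src, "w") as handle:
            handle.write("@read1\nACGT\n+\nIIII\n")
        for mode in (True, False):
            lib = os.path.join(tmp, "linked" if mode else "copied")
            os.mkdir(lib)
            dest = register_file(src, lib, link=mode)
            print(lib, "duplicates data:", is_separate_copy(src, dest))
```

With linking, the original files must stay at the same path and remain readable by the process that uses them, which is the usual trade-off for avoiding duplication.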
Thanks a lot for your quick answers, Greg and Davide! So it looks like this is completely feasible, because we have a ZFS filesystem. As for the user permissions on files defined on our file server: if I understood correctly, the current Galaxy version doesn't take them into account, but might that become possible in future development?
written 7.5 years ago by Fanny Coffin
7.5 years ago, Davide Cittaro wrote:
Hi, AFAIK most of the data will be duplicated when uploading/importing. I suggest deploying Galaxy on a filesystem that has deduplication capabilities. I've successfully installed Galaxy on Nexenta CP3 + ZFS (waiting for Illumos). Recent ZFS builds support deduplication and compression. HTH d /* Davide Cittaro Cogentech - Consortium for Genomic Technologies via adamello, 16 20139 Milano Italy tel.: +39(02)574303007 */
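As a rough illustration of why a deduplicating filesystem helps when the same file ends up stored twice, the Python sketch below models block-level deduplication: each fixed-size block is stored once, keyed by its hash. The function `dedup_store` and the 4096-byte block size are assumptions made for this example; real ZFS deduplication works inside the filesystem and differs in many details.

```python
import hashlib


def dedup_store(blobs, block_size=4096):
    """Return (logical_bytes, physical_bytes) for blobs stored with block dedup.

    Illustrative model only; the name and block size are assumptions.
    """
    store = {}   # block digest -> block contents, kept once per unique block
    logical = 0  # bytes the user believes are stored
    for blob in blobs:
        for offset in range(0, len(blob), block_size):
            block = blob[offset:offset + block_size]
            logical += len(block)
            # Identical blocks hash to the same key and are stored only once.
            store.setdefault(hashlib.sha256(block).digest(), block)
    physical = sum(len(block) for block in store.values())
    return logical, physical


if __name__ == "__main__":
    original = b"A" * 8192 + b"B" * 4096
    imported_copy = bytes(original)  # the same data, uploaded a second time
    logical, physical = dedup_store([original, imported_copy])
    print("logical bytes:", logical, "physical bytes:", physical)
```

In this model a second upload of an identical file adds nothing to the physical size, which is the effect being relied on when Galaxy copies data during import.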
