Question: Importing Very Large NGS DataSets
mmoore wrote:

I'm running my own Galaxy server, and I'm looking to import 20-40 GB NGS FASTQ files for processing. I know that's unusual. :)

Currently I'm manually uploading to the FTP directory, and importing using "Choose FTP". The upload script completes in a short period of time, and the upload job is successfully started in the galaxy queue.

However, the script is taking hours to complete for each file. Is there any way to speed up the process by linking the files directly, by skipping some of the sanity checks contained within the script, or by some other means?



modified 12 days ago by Jennifer Hillman Jackson • written 12 days ago by mmoore

Most of the time you spend waiting when importing large datasets into Galaxy from a local filesystem is most probably spent in the 'detecting metadata' step, when Galaxy is trying to reason about the data (counting sequences, etc.). This would go faster if you can get a faster machine to run the job.

Besides that, I do not think there is much you can do, since Galaxy needs metadata for every dataset.
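
To get a feel for why this step is slow on 20-40 GB files, here is a minimal sketch of a sequence count. It is not Galaxy's actual metadata code; the function name `count_fastq_records` is made up for illustration, and it assumes the common FASTQ layout of four lines per record. The point is that every byte of the file has to be read once, on a single core.

```python
import gzip


def count_fastq_records(path):
    """Count records in a FASTQ file (plain text or .gz).

    Assumes four lines per record; blank lines are ignored.
    """
    # Pick a gzip-aware opener for compressed files (assumption: ".gz" suffix).
    opener = gzip.open if str(path).endswith(".gz") else open
    lines = 0
    with opener(path, "rb") as handle:
        # Stream line by line so memory use stays flat even for huge files.
        for line in handle:
            if line.strip():
                lines += 1
    if lines % 4 != 0:
        raise ValueError(f"{path}: {lines} non-empty lines is not a multiple of 4")
    return lines // 4
```

Timing this on one of your files gives a rough lower bound for any check that has to scan the whole dataset.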

written 12 days ago by Martin Čech

Thanks Martin. And yes, I can pretty much confirm this is the case: one core, 100% utilized for 4+ hours on the script. Maybe time for me to have a closer look at that script :)

written 12 days ago by mmoore
Jennifer Hillman Jackson wrote:


This method would probably work out for your case:

Once the FASTQ files are in a data library, they can be copied into histories as datasets to run the analysis. Copied datasets do not use up more disk space; they just link back to the original file in the library.
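
If you would rather script the library import than click through the UI, something along these lines may work with BioBlend. Treat it as a sketch under assumptions: the helper name `link_files_into_library`, the library name, and the paths are made up for illustration; it needs an admin API key; and, as far as I know, importing by server path also has to be enabled in the Galaxy config. `link_data_only="link_to_files"` asks Galaxy to reference the files in place instead of copying them.

```python
def link_files_into_library(gi, library_name, server_paths, file_type="fastqsanger"):
    """Create a data library and add files by linking to paths on the Galaxy server.

    `gi` is expected to be a bioblend.galaxy.GalaxyInstance created with an
    admin API key. `server_paths` are absolute paths as seen by the Galaxy
    server process (hypothetical examples are shown below).
    """
    library = gi.libraries.create_library(library_name)
    # BioBlend expects the paths as one string, one path per line.
    return gi.libraries.upload_from_galaxy_filesystem(
        library["id"],
        "\n".join(server_paths),
        file_type=file_type,
        link_data_only="link_to_files",  # reference in place, do not copy
    )


# Example usage (URL, key, and paths are placeholders, not real values):
#
#   from bioblend.galaxy import GalaxyInstance
#   gi = GalaxyInstance(url="http://localhost:8080", key="YOUR_ADMIN_API_KEY")
#   link_files_into_library(gi, "NGS runs", ["/data/ngs/sample1.fastq"])
```

Setting the file type explicitly skips format sniffing, but Galaxy will still set metadata on the dataset, so this saves the copy rather than the whole wait.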

I also asked the developers if they had better ideas or more to add. Please follow their responses on Gitter (or they might reply directly back here):

Thanks! Jen, Galaxy team

modified 12 days ago • written 12 days ago by Jennifer Hillman Jackson