Question: FTP upload problem -- Upload data size limit is set at 50 GB, per file
8 months ago, aida.shahraki wrote:

I have a problem uploading my data to usegalaxy.org. I was trying to upload my data (two files, each about 90 GB) to Galaxy through an FTP link, but after about 100 GB had been uploaded, I received a "file already exists" error and the transfer stopped. I would like to analyze these data further in Galaxy, so it is very important to upload them. Can anyone help? My username in Galaxy is "aida.shahraki@gmail.com". Regards,

8 months ago, Jennifer Hillman Jackson (United States) wrote:

Hello,

There is a 50 GB per-file size limit for uploaded datasets at Galaxy Main (https://usegalaxy.org). When uploading BAM files, the practical maximum is somewhat smaller because of how these data are indexed on the server -- about 25-35 GB per BAM file works best.

Please consider using your own Galaxy server when analyzing very large datasets:

Finally, there is a workaround. It is not necessarily recommended, because datasets this large may be too large to process once in Galaxy or may exceed your account quota (250 GB), but it is generally possible for certain limited use cases. You can try it if you want, but you will need to troubleshoot any problems yourself if it doesn't work.

  • Split the data into smaller chunks, upload them individually, then merge them back together.
  • This works best for uncompressed plain text files: GTF/GFF3, BED, FASTQ, INTERVAL
  • In the Upload tool, directly assign the datatype "tabular" during upload to prevent the chunked data from being detected as a known datatype. The goal is to preserve all original formatting. Do NOT use the option to convert "whitespace to tabs" (under the gear icon).
  • Then once all chunked data is in your history, merge the data back together with Concatenate.
  • At the end, it is important to double-check the formatting: compare character/word/line counts between the original and the uploaded/merged file, view the file to confirm basic formatting, assign the final datatype (bed, fastqsanger, etc.), then view the file again to make sure any assigned metadata is correct (important for some datatypes).
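
For anyone who wants to try the split-and-verify steps locally before uploading, here is a minimal sketch in Python. It is not part of Galaxy; the function names `split_by_lines` and `counts`, the `.partNNNN` chunk naming, and the example paths are all hypothetical. It assumes an uncompressed plain-text file and breaks chunks only at line ends, so concatenating the chunks in order reproduces the original byte for byte.

```python
import os

def split_by_lines(path, out_dir, max_bytes):
    """Split a plain-text file into chunks of at most max_bytes each,
    breaking only at line ends. A single line longer than max_bytes
    still goes into one chunk on its own. Returns chunk paths in order."""
    chunk_paths = []
    out = None
    written = 0
    with open(path, "rb") as src:
        for line in src:
            # Open a new chunk at the start, or when this line would
            # push a non-empty chunk past the size limit.
            if out is None or (written > 0 and written + len(line) > max_bytes):
                if out is not None:
                    out.close()
                name = f"{os.path.basename(path)}.part{len(chunk_paths):04d}"
                chunk_path = os.path.join(out_dir, name)
                out = open(chunk_path, "wb")
                chunk_paths.append(chunk_path)
                written = 0
            out.write(line)
            written += len(line)
    if out is not None:
        out.close()
    return chunk_paths

def counts(paths):
    """Return (lines, words, bytes) summed over the given files,
    comparable to the totals reported by `wc`."""
    lines = words = size = 0
    for p in paths:
        with open(p, "rb") as fh:
            for line in fh:
                lines += line.count(b"\n")
                words += len(line.split())
                size += len(line)
    return lines, words, size
```

A call such as `split_by_lines("reads.fastq", "chunks", 40 * 1024**3)` (hypothetical paths, 40 GiB per chunk) would keep each piece under the 50 GB limit, and `counts(parts) == counts(["reads.fastq"])` gives the line/word/character comparison mentioned above. The individual chunks are not guaranteed to be valid FASTQ/BED on their own, which is one more reason to upload them as "tabular" and assign the real datatype only after Concatenate.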

For others: any public Galaxy server may set different limits. I don't personally know of any that allow uploads of more than 50 GB, but a public server's help/documentation might note its limit, or the admins can be contacted directly to find out.

Thanks! Jen, Galaxy team
