I have been unable to upload data files into Galaxy Main since Friday
18th May 2012. Today is my fourth day of attempting uploads.
Refreshing and leaving the files to upload overnight does not work.
Although Jennifer has stated that the bug was fixed at 5.30pm today, I
am still unable to upload data files. I thought I might be exceeding the
maximum file size, but I am well below it at only 1.8 GB.
Are you still experiencing problems now? Galaxy may have been busy
immediately following the resolution of the cluster problem, although
your problem does appear to be unrelated.
It sounds like you are uploading files through a browser. A better
option would be to use FTP. This is required for datasets approaching or
exceeding 2 GB in size.
Files that are under 2 GB, really any file over ~500 MB, can also benefit
from FTP upload. An FTP client tracks the progress of an upload and can
resume an interrupted transfer. http://wiki.g2.bx.psu.edu/FTPUpload
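
If you would rather script the transfer than use a graphical FTP client,
the resume behaviour can be sketched with Python's standard ftplib. This
is only a minimal sketch under assumptions: the host, user name and
password are placeholders you supply (take the real values from the wiki
page above), plain FTP is assumed, and the server is assumed to support
the SIZE and REST commands, which not every server does.

```python
import ftplib
import os


def resume_offset(local_size, remote_size):
    """Return the byte offset to resume from, or 0 to start over."""
    # A missing remote file, or one larger than the local file, cannot be
    # a partial copy of it, so the upload restarts from the beginning.
    if remote_size is None or remote_size > local_size:
        return 0
    return remote_size


def upload_with_resume(host, user, password, local_path):
    """Upload local_path over FTP, continuing a partial upload if one exists.

    host, user and password are placeholders, not confirmed Galaxy values.
    """
    remote_name = os.path.basename(local_path)
    local_size = os.path.getsize(local_path)
    with ftplib.FTP(host) as ftp:
        ftp.login(user, password)
        ftp.voidcmd("TYPE I")  # binary mode; many servers need it for SIZE
        try:
            remote_size = ftp.size(remote_name)
        except ftplib.error_perm:
            remote_size = None  # nothing uploaded yet
        offset = resume_offset(local_size, remote_size)
        if offset == local_size:
            return  # the remote copy is already complete
        sent = offset

        def report(block):
            # Called by ftplib after each block is sent.
            nonlocal sent
            sent += len(block)
            print(f"\r{sent}/{local_size} bytes", end="")

        with open(local_path, "rb") as fh:
            fh.seek(offset)  # skip the bytes the server already has
            ftp.storbinary(f"STOR {remote_name}", fh,
                           callback=report, rest=offset or None)
```

Calling upload_with_resume again after a dropped connection should pick
up where the last attempt stopped, provided the server honours REST.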
Hopefully this helps,
I ran a few tests and found that changing the file suffix to .txt when
using the "autodetect" upload type function sped up the loading
considerably. As the final result is identical to the Galaxy dataset that
is produced when using the existing suffix, this is something I would
recommend that you try next time.
For my test, I took one of your files and changed the suffix directly. No
other changes were made to the content, as it was already a
tab-delimited text file. I didn't continue the testing to specify
the datatype at upload (tabular would be the correct choice), but this
is a change that may also speed up import slightly. The .txt suffix
change alone was dramatic and the upload was quick (I ran a
side-by-side comparison of an original and a .txt-suffix modified file).
The general reason behind this is that Galaxy will interpret data to
detect and confirm datatypes during upload, in order to create the
associated metadata needed for tool use. Detection is a convenience
option that comes at a cost (compute resource and time). If you can
provide this information
instead, the detection portion of the process can be avoided,
confirmation and metadata creation can be started directly, and the
result is a quicker upload.
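
If you have many files, the suffix change described above can be
scripted before upload. A minimal sketch, assuming the files are already
plain tab-delimited text (the function name is just for illustration);
it makes a copy so the original file is left untouched:

```python
import shutil
from pathlib import Path


def copy_with_txt_suffix(path):
    """Copy a file next to itself with its last suffix replaced by .txt."""
    src = Path(path)
    dst = src.with_suffix(".txt")  # appends .txt if there is no suffix
    if dst == src:
        return dst  # already a .txt file, nothing to do
    shutil.copyfile(src, dst)  # content is copied byte for byte
    return dst
```

Only the name changes; the content of the copy is identical to the
original.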
Hopefully this helps for next time,
If some columns of data were empty, then there should be two adjacent
tabs in your data, and Galaxy would leave a blank, empty value in that
column. You will need to check how your application actually outputs
data, possibly pad the empty values, and consider exporting
as plain, unformatted text if at all possible.
You could also try padding these empty values with a null value.
Commonly used values are a dot "." and a zero "0". Different tools
expect different null values, see the tool forms for expected input
formats. Completely blank values will cause problems with many tools.
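
The padding can be done before upload with a few lines of Python. A
minimal sketch, assuming a tab-delimited file and using "." as the null
value (the function names are illustrative, and you should swap in
whatever null value your tool expects):

```python
def pad_empty_fields(line, null_value="."):
    """Replace empty or whitespace-only tab-separated fields with null_value."""
    fields = line.rstrip("\r\n").split("\t")
    return "\t".join(f if f.strip() else null_value for f in fields)


def pad_file(src_path, dst_path, null_value="."):
    """Write a padded copy of src_path to dst_path, one line at a time."""
    with open(src_path, newline="") as src, \
            open(dst_path, "w", newline="") as dst:
        for line in src:
            if not line.strip("\r\n"):
                dst.write("\n")  # keep fully blank lines as they are
                continue
            dst.write(pad_empty_fields(line, null_value) + "\n")
```

For example, pad_empty_fields("a\t\tc\n") gives "a\t.\tc", so the two
adjacent tabs become a visible placeholder and the column count is
unchanged.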
You would probably also want to avoid "Convert spaces to tabs:", unless
you know this will produce the correct output.
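
To see why that option can be risky, here is a small illustration. It
assumes the conversion turns every run of whitespace into a single tab;
that is my assumption for the example, not a confirmed description of
what Galaxy does, so test on a small file first.

```python
import re


def collapse_whitespace_to_tabs(line):
    """Hypothetical stand-in: turn each run of whitespace into one tab."""
    return re.sub(r"\s+", "\t", line.rstrip("\r\n"))


# Four columns: the third is empty and the fourth contains a space.
row = "chr1\t100\t\tgene A"
print(row.split("\t"))
print(collapse_whitespace_to_tabs(row).split("\t"))
```

Under that assumption the empty third column disappears and "gene A" is
split in two, so the values no longer line up with their columns.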