4.5 years ago by
United States
So, the short answer here is that the 'is_binary' check is failing to detect this particular file, because the first 100 characters all happen to be printable (see is_binary() in lib/galaxy/util/__init__.py and check_binary in lib/galaxy/datatypes/checkers.py if you're really interested). Since the upload method fails to detect the file as binary (or as a datatype with to_posix_lines set to False), the line endings are automatically converted.
I'm not sure why the conversion is even being attempted when the extension is set manually, but there may be a good reason I'm not aware of, so I'll look into it. I don't see this having changed recently. A really hacky fix for right now, which results in the file being detected correctly, is to bump up the temp.read(100) in check_binary so that it samples more bytes from the file.
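To illustrate the failure mode, here is a minimal sketch. It is not Galaxy's actual is_binary()/check_binary() code: looks_binary, to_posix_lines (as written here), and the sample file contents are all made up for the example, and I'm assuming "printable" means roughly what Python's string.printable covers.

```python
import string

# Hypothetical stand-in for the sampling check described above; this is NOT
# Galaxy's actual is_binary()/check_binary() implementation.
PRINTABLE = set(string.printable.encode("ascii"))


def looks_binary(data: bytes, sample_size: int = 100) -> bool:
    """Return True if any byte in the first sample_size bytes is non-printable."""
    sample = data[:sample_size]
    return any(byte not in PRINTABLE for byte in sample)


def to_posix_lines(data: bytes) -> bytes:
    """Convert CRLF and lone CR line endings to LF, as a text upload might."""
    return data.replace(b"\r\n", b"\n").replace(b"\r", b"\n")


# A made-up file: 120 printable header bytes followed by a binary payload
# that happens to contain the byte pair 0x0D 0x0A.
header = b"#" * 120
payload = bytes([0x00, 0xFF, 0x0D, 0x0A, 0x10])
blob = header + payload

print(looks_binary(blob))                    # False: the 100-byte sample is all printable
print(looks_binary(blob, sample_size=1024))  # True: the larger sample reaches the payload
print(len(blob), len(to_posix_lines(blob)))  # 125 124: the conversion dropped a byte
```

With the small sample the file is treated as text and the conversion silently changes its length, which is the kind of damage a binary format won't survive. A larger sample only makes a miss less likely; it doesn't rule one out.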
Edit: This should be resolved in the following commit, which will be in the next release:
https://bitbucket.org/galaxy/galaxy-central/commits/8b6e1ffaa053f87c944290f3a84e5f73633cd901
Can you check the file size once uploaded to establish whether the change happens on upload or on download (or worse, whether both damage the binary file)?
How do I check the uploaded file size?
Just wanted to bump this. Has anyone had this problem? It is stopping us from using our instance of Galaxy.