Question: Decompressing files on a local Galaxy instance
2.2 years ago, ChickenRNA (50) wrote:

Hi, I am using a local instance of Galaxy. I have uploaded my datasets, and they are in fastq.gz format. The first step I need to perform is to use "Fastq groomer" to get my files into the correct format for TopHat. When using the web-based version of Galaxy, I believe it automatically decompresses the files in the background, so the uploaded files can be used directly in "Fastq groomer". However, that is not the case in my local instance of Galaxy. Is there a separate tool that needs to be installed for a local instance of Galaxy to decompress the .gz files? Or do the files need to be decompressed before uploading? Thanks in advance for your time.

modified 2.2 years ago by Jennifer Hillman Jackson (25k) • written 2.2 years ago by ChickenRNA (50)

Did you upload them or link them into "shared data"? If you do anything except link them in, they'll be uncompressed.

written 2.2 years ago by Devon Ryan (1.9k)

I added them to the data libraries using a shared data directory. Is there any way to address it without having to decompress all the files before adding them?

written 2.2 years ago by ChickenRNA (50)

Some of the Galaxy wrappers aren't written in a way to handle gzipped files. Perhaps fastq groomer is one of those (in any case, its output will be uncompressed, which I agree is a design flaw, since uncompressed fastq files should never exist).

written 2.2 years ago by Devon Ryan (1.9k)
2.2 years ago, Jennifer Hillman Jackson (25k, United States) wrote:

Hello,

By default, uploaded files are uncompressed when added to a history. Was there some administrative change to leave them compressed (it is an option)? Or is the file just still named "gz" while the dataset is actually uncompressed? The file names are not changed upon upload.
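
One way to check whether a file is really gzip-compressed, regardless of its name, is to look at its first two bytes: every gzip stream starts with the bytes 0x1f 0x8b. The sketch below is a minimal illustration that runs outside Galaxy on the file system; `is_gzipped` is a hypothetical helper name, not part of Galaxy.

```python
GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip stream (RFC 1952)


def is_gzipped(path):
    """Return True if the file at `path` starts with the gzip magic bytes.

    Only the content is inspected, so a plain-text file that is merely
    named "something.fastq.gz" is reported as not compressed.
    """
    with open(path, "rb") as handle:
        return handle.read(2) == GZIP_MAGIC
```

Calling `is_gzipped("/path/to/reads.fastq.gz")` on one of the linked files (the path here is a placeholder) shows whether the data on disk is still compressed or only carries the ".gz" name.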

If uncompressed, maybe the tool is not picking them up because of a missing datatype. It should be assigned as "fastq" or some version of fastq. https://wiki.galaxyproject.org/Support#Tool_doesn.27t_recognize_dataset

And finally, you might not need to groom. See this wiki about how to tell if the datatype "fastqsanger" can just be assigned. https://wiki.galaxyproject.org/Support#FASTQ_Datatype_QA
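
As a rough complement to the wiki's instructions (and not a replacement for them), the quality-score offset of a FASTQ file can be guessed from the range of characters on its quality lines. The sketch below assumes plain four-line FASTQ records with no wrapped sequence lines; `quality_range` and `guess_offset` are hypothetical helper names, and the thresholds are a common heuristic rather than anything Galaxy itself runs.

```python
import gzip

GZIP_MAGIC = b"\x1f\x8b"


def open_text(path):
    """Open a FASTQ file as text, transparently handling gzip compression."""
    with open(path, "rb") as probe:
        compressed = probe.read(2) == GZIP_MAGIC
    return gzip.open(path, "rt") if compressed else open(path, "rt")


def quality_range(path, max_records=10000):
    """Return (lowest, highest) ASCII code seen on the quality lines.

    Assumes strict four-line records: header, sequence, '+', quality.
    Only the first `max_records` records are scanned.
    """
    lowest, highest = None, None
    with open_text(path) as handle:
        for index, line in enumerate(handle):
            if index >= 4 * max_records:
                break
            if index % 4 != 3:  # only every fourth line holds qualities
                continue
            codes = [ord(char) for char in line.rstrip("\r\n")]
            if not codes:
                continue
            lowest = min(codes) if lowest is None else min(lowest, min(codes))
            highest = max(codes) if highest is None else max(highest, max(codes))
    if lowest is None:
        raise ValueError("no quality lines found in " + str(path))
    return lowest, highest


def guess_offset(lowest, highest):
    """Heuristic guess of the quality offset: 33, 64, or None if ambiguous."""
    if lowest < 59:   # characters below ';' do not occur in +64 encodings
        return 33
    if highest > 74:  # characters above 'J' are unusual for Phred+33 reads
        return 64
    return None
```

If `guess_offset(*quality_range(path))` returns 33, the file is a candidate for having "fastqsanger" assigned directly. A result of 64 or None suggests checking more carefully, or grooming, before continuing.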

Let us know if this does not address the issue. Jen, Galaxy team

written 2.2 years ago by Jennifer Hillman Jackson (25k)

Thank you, Jennifer! I did not make any changes to the administrative settings to leave them compressed. Instead of uploading all the files, I linked them from a folder in the directory, as I am dealing with a large data set. Could that have been the issue? If so, is there a way to address that?

modified 2.2 years ago • written 2.2 years ago by ChickenRNA (50)

If these were added to a Data Library without loading (aka "copy") into Galaxy, then there is no way to uncompress them. Instead, you have two choices:

1) Use the Data Library upload method that actually copies the data into Galaxy. This will uncompress them.

2) Uncompress the files in the linked directory and re-add them to Galaxy in that format.

Reference: https://wiki.galaxyproject.org/Admin/DataLibraries/UploadingLibraryFiles
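
Option 2 can be done outside Galaxy with a small script. The sketch below is a minimal illustration, assuming the linked directory holds files ending in ".gz"; `gunzip_directory` and its `remove_original` flag are hypothetical names introduced here, not a Galaxy tool or API.

```python
import gzip
import shutil
from pathlib import Path


def gunzip_directory(directory, remove_original=False):
    """Decompress every *.gz file in `directory`, one file at a time.

    Each file is streamed to a temporary ".partial" file and renamed only
    after decompression finishes, so an interrupted run does not leave a
    truncated FASTQ behind under the final name. Existing targets are
    never overwritten. Returns the list of files that were written.
    """
    written = []
    for source in sorted(Path(directory).glob("*.gz")):
        target = source.with_suffix("")  # reads.fastq.gz -> reads.fastq
        if target.exists():
            continue  # already decompressed earlier; leave it alone
        partial = target.with_name(target.name + ".partial")
        with gzip.open(source, "rb") as compressed, open(partial, "wb") as plain:
            shutil.copyfileobj(compressed, plain)  # streams in chunks
        partial.replace(target)
        if remove_original:
            source.unlink()  # keep only the uncompressed copy
        written.append(target)
    return written
```

With `remove_original=True`, at most one file exists in both compressed and uncompressed form at any moment, which helps when disk space is tight. The uncompressed files then need to be re-added to the Data Library, as described in option 2.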

modified 2.2 years ago • written 2.2 years ago by Jennifer Hillman Jackson (25k)

When I try to copy the files into Galaxy, I run out of space because of the size of my data set. Is there a way to copy the files directly into Galaxy so that I don't have two copies of them?

written 2.2 years ago by ChickenRNA (50)

One of the options when loading into a data library is not to copy the data into Galaxy. This is the option you probably want. Just make sure the files are uncompressed before doing this. The link I sent as reference has the instructions.

written 2.2 years ago by Jennifer Hillman Jackson (25k)