Question: Local Galaxy. Trimmomatic does not see fastq files.
14 months ago, hamza_karakurt wrote:

Hello, I am new to Galaxy and am trying to use it both locally and on the public server. I installed some tools in my local instance and tried to run an RNA-seq analysis, following this tutorial: https://usegalaxy.org/u/jeremy/p/galaxy-rna-seq-analysis-exercise

On the Galaxy server everything works. On my local Galaxy (Ubuntu 16.04 in an Oracle VirtualBox virtual machine), I can run FastQC for quality control, but when I try to use FastQ Quality Trimmer or Trimmomatic, the tool does not see any fastq files.

What could be causing this? Why don't the trimming tools see the fastq files? Is the problem in my VirtualBox setup or on my local machine?

Thank you.

modified 14 months ago by zachary.gaber • written 14 months ago by hamza_karakurt

Are the fastq datasets assigned the datatype fastqsanger or fastqsanger.gz? That is needed for the tool to recognize the inputs.

Related FAQs: https://galaxyproject.org/support/#getting-inputs-right-

Let's eliminate that sort of issue first, then troubleshoot server items if still needed. Let us know!

Jen, Galaxy team

modified 14 months ago • written 14 months ago by Jennifer Hillman Jackson

The datasets are assigned the datatype fastqsanger.

I have no idea why I am getting this error.

Thank you for your interest.

written 14 months ago by hamza_karakurt

Do you want to share a screenshot of the dataset so we can double check? Sometimes the datatype fastqcssanger is accidentally assigned (it is easy to mix up the two).

Also, the tutorial you are using is an older one. Better choices for learning Galaxy would be the RNA-seq and other tutorials here: https://galaxyproject.org/learn/

written 14 months ago by Jennifer Hillman Jackson

Here is the screenshot of my history. Files are in fastqsanger format.

https://imgur.com/a/6HL3f

modified 14 months ago • written 14 months ago by hamza_karakurt

Hi - I can't see the assigned datatype in the graphic, just the dataset names. The assigned datatype needs to be fastqsanger.

I tried to find your copy of the replicated history at https://usegalaxy.org, but even after looking through the deleted histories I couldn't find one that matches. It might still be in there. If you want me to review it, I'll need the history name and the "Create" date. You can find this info under "Saved histories" > Advanced search > status == all. Even if the history is deleted, I may be able to see what is going on.

written 14 months ago by Jennifer Hillman Jackson
14 months ago, zachary.gaber (United Kingdom) wrote:

Hello, I had a similar issue with Trimmomatic earlier this week. I uploaded .gz files which Galaxy decompressed and identified as fastq files. Trimmomatic couldn't recognize the files when run from the tool bar - although oddly, workflows which had Trimmomatic as the first step in the process had absolutely no problems processing the fastq files. Weird.

Things like this have happened to me rather frequently ever since I started using Galaxy several years ago. For whatever reason (whether it's the way my institute formats and compresses its fastq files or the way Galaxy decompresses them and identifies the file type), my fastq files are quite often in a format that many Galaxy tools have trouble recognizing. I'm not sure if I have the best workaround, but this is what I was taught to do by another Galaxy user: include a grooming step in my workflows. The "FASTQ GROOMER" tool seems to format my fastq files appropriately (they go in as fastq, they come out as fastqsanger), and after running the files through it I never have this issue with tools working on my data. Just make sure you use the proper settings. The "FASTQ GROOMER" tool is mainly intended to convert between FASTQ quality score formats, so make sure you tell it which score format your sequencing machine uses, or you might mess up your files.

Zachary

written 14 months ago by zachary.gaber
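
To illustrate the point above about telling the groomer which score format the data uses, here is a minimal Python sketch. It is not how FASTQ Groomer is implemented. It assumes plain 4-line FASTQ records (no wrapped sequence lines), and the function names are hypothetical. The heuristic relies on the fact that characters below ASCII 59 can only appear in Sanger (Phred+33) quality strings, while a file whose lowest character is 64 or higher is probably Phred+64, although a uniformly high-quality Sanger file could look the same.

```python
def iter_quality_lines(lines):
    """Yield the quality string (4th line) of each 4-line FASTQ record."""
    for index, line in enumerate(lines):
        if index % 4 == 3:
            yield line.rstrip("\n")


def guess_phred_offset(quality_lines):
    """Guess the ASCII offset of the quality scores.

    Returns 33 (Sanger), 64 (probably older Illumina), or None when the
    observed characters do not settle the question.
    """
    lowest = min((ord(ch) for q in quality_lines for ch in q), default=None)
    if lowest is None:
        return None   # no quality characters seen
    if lowest < 59:
        return 33     # only possible with a Phred+33 encoding
    if lowest >= 64:
        return 64     # likely Phred+64, but not guaranteed
    return None       # 59..63: could be Solexa+64 or Phred+33


def phred64_to_phred33(quality):
    """Re-encode one Phred+64 quality string as Phred+33 (Sanger)."""
    if any(ord(ch) < 64 for ch in quality):
        raise ValueError("quality string is not Phred+64 encoded")
    return "".join(chr(ord(ch) - 31) for ch in quality)
```

Typical use would be `guess_phred_offset(iter_quality_lines(open("reads.fastq")))`, where `reads.fastq` is a placeholder file name. If the result is None, check the sequencing run's documentation rather than guessing.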

Thanks for the feedback, Zachary! The way the datatype is assigned to fastq data upon upload is changing in the upcoming release (17.09). This is live right now at https://usegalaxy.org.

This should help a great deal with automatically and correctly labeling data that already has fastqsanger-scaled quality scores. Data that is not fastqsanger-scaled will still require the groomer.

The release is in a test/integration stage, so we'd like to learn about any new odd behaviors found (with this function or others).

I am going to see if I can reproduce the difference in Trimmomatic behavior between launching from the tool panel and launching from a workflow. That is odd (and unexpected).

Thanks! Jen

written 14 months ago by Jennifer Hillman Jackson

Update: For compressed fastq data, the datatype will need to be adjusted. You can do this upon upload by setting the datatype, or after upload with Edit Attributes. If you use "autodetect" with the upload tool, the data will simply be assigned the datatype fastq.gz, which is not enough for tools that expect some flavor of fastqsanger format.

To be clear: compressed datatype labels would be fastqsanger.gz or fastqsanger.bz2 depending on the original format. This FAQ covers the details:

I am still looking at why the tool input behavior is different when invoked with a workflow.

modified 14 months ago • written 14 months ago by Jennifer Hillman Jackson
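
As a small illustration of the labels named in the update above, the following Python sketch picks fastqsanger, fastqsanger.gz, or fastqsanger.bz2 for a local file by looking at its first bytes. It is only a helper for deciding what to select in the upload form, not part of Galaxy, and the function names are hypothetical. It also assumes the quality scores are already Sanger-scaled, which the sketch does not verify.

```python
GZIP_MAGIC = b"\x1f\x8b"   # first two bytes of every gzip stream
BZIP2_MAGIC = b"BZh"       # first three bytes of every bzip2 stream


def fastqsanger_label(first_bytes):
    """Return the datatype label matching the file's compression."""
    if first_bytes.startswith(GZIP_MAGIC):
        return "fastqsanger.gz"
    if first_bytes.startswith(BZIP2_MAGIC):
        return "fastqsanger.bz2"
    return "fastqsanger"    # assumed to be uncompressed text


def label_for_file(path):
    """Read just enough of the file to identify its compression."""
    with open(path, "rb") as handle:
        return fastqsanger_label(handle.read(3))
```

For example, `label_for_file("sample_R1.fastq.gz")` (a placeholder name) would return the label to choose instead of "autodetect" when uploading that file.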