I exported RNA-seq data from the EBI SRA database. It came in xxx.fastq.gz format, which is not accepted by Tophat. I also tried FASTQ Groomer, but that job failed too. How can I solve this?
Hello,
The datatype "fastqsanger.gz" can also be assigned. This keeps the data in compressed format, which reduces the history/dataset size that counts toward your account quota. Be sure not to simply assign "fastqsanger" to compressed data - tools would then not know that the data is compressed.
This is brand-new functionality related to the upcoming Galaxy 17.01 release. Documentation about usage will be updated once all behavior tuning is complete, around the time of the official release (estimate: a few weeks).
Fastqsanger format is required for tools that interpret quality scores for calculations. Confirming fastqsanger (Sanger Phred +33) format is important, or low mapping rates will result (and sometimes errors, depending on the tool). How to check the format/current quality score scaling in uploaded datasets with FastQC, and how to rescale (if needed) with the Fastq Groomer tool, is explained here: https://wiki.galaxyproject.org/Support#FASTQ_Datatype_QA
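For anyone who wants a quick check outside of Galaxy before uploading, here is a minimal sketch of the usual heuristic: look at the range of ASCII codes on the quality lines. It is not what FastQC or Fastq Groomer run internally. The function names and the cutoffs (59 and 74) are assumptions for illustration, and it assumes plain four-line FASTQ records that are already uncompressed.

```python
from itertools import islice


def quality_range(lines, max_reads=10000):
    """Return (lowest, highest) ASCII code seen on the quality lines.

    Assumes four lines per record, so the quality string is every 4th line.
    """
    lo, hi = 255, 0
    for i, line in enumerate(islice(lines, 4 * max_reads)):
        if i % 4 == 3:
            codes = [ord(c) for c in line.rstrip("\n")]
            if codes:
                lo = min(lo, min(codes))
                hi = max(hi, max(codes))
    return lo, hi


def guess_encoding(lo, hi):
    """Guess the quality offset from the observed ASCII range (heuristic only)."""
    if lo < 59:
        # Characters below ';' cannot occur in +64 data, so this is Phred+33.
        return "phred33"
    if hi > 74:
        # Characters above 'J' are unusual for Phred+33 Illumina data.
        return "phred64"
    return "ambiguous"


if __name__ == "__main__":
    record = ["@read1\n", "ACGT\n", "+\n", "!''*\n"]
    print(guess_encoding(*quality_range(record)))  # phred33
```

A result of "phred33" is consistent with fastqsanger. A result of "phred64" suggests the data needs rescaling before that datatype is assigned, and "ambiguous" means more reads (or FastQC) are needed to decide.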
Thanks!! Jen
Yes, now the problem is solved with Fastq Groomer. Thank you.
But meanwhile another problem arose. I started a Tophat job two days ago with roughly 500 MB of data, and it is still in the queue. Is there any problem with the server?
Please help me out with this issue. It's urgent.
Thank You, Sangram
Expanded help for compressed fastq.gz data (some duplicated with this post, but wanted to link for reference for others): https://biostar.usegalaxy.org/p/21485/#21500
Reported issues with queued job delays at http://usegalaxy.org are being combined into this post, where updates and feedback will be posted as soon as the core issue has been investigated and we have advice on how to remedy it: https://biostar.usegalaxy.org/p/21484
Make sure your 'Datatype' is fastq.gz
Then go to 'Convert Format' and convert it from fastq.gz to fastqsanger
Yes, my file is in fastq.gz format. But which option under "Convert Format" do I use for this? I looked through all of them, but there is no fastq.gz to fastqsanger conversion.
Click 'Edit Attributes', then click 'Convert Format' at the top and select the fastq.gz to fastq conversion.
Tophat only takes fastqsanger for some reason, so you have to change the datatype from fastq to fastqsanger afterwards.
Good clarification! Uncompress first (fastq.gz > fastq), then assign fastqsanger. Or change the datatype to "fastqsanger.gz" directly for fastq.gz datasets without uncompressing.
Be careful to not assign a compressed datatype to an uncompressed dataset, and the reverse. Also make sure data really is in fastqsanger/fastqsanger.gz format before assigning that to uploaded fastq data. Both are very important to assign correctly.
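If you are unsure whether a local file is really compressed before uploading it and picking a datatype, a small check like the one below can help. This is only a sketch: it looks for the two-byte gzip signature at the start of the file, and the helper names (is_gzipped, suggested_datatype, uncompress) are made up for this example. suggested_datatype also assumes the quality scores are already known to be Sanger Phred +33.

```python
import gzip
import shutil

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip stream


def is_gzipped(path):
    """Return True if the file starts with the gzip signature."""
    with open(path, "rb") as fh:
        return fh.read(2) == GZIP_MAGIC


def suggested_datatype(path):
    """Pick the compressed or uncompressed datatype name to match the file.

    Assumes the quality scores are already Sanger Phred +33.
    """
    return "fastqsanger.gz" if is_gzipped(path) else "fastqsanger"


def uncompress(src, dst):
    """Write an uncompressed copy of a gzip file (fastq.gz > fastq)."""
    with gzip.open(src, "rb") as fin, open(dst, "wb") as fout:
        shutil.copyfileobj(fin, fout)
```

The point is the same as above: the datatype should describe what the file actually is, not what its file name extension says.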
Fastqsanger indicates Sanger Phred +33 quality score scaling. This is interpreted by tools. If the data is a different type (fastqillumina 1.3-1.7), and assigned as fastqsanger, tools may not fail, but the scientific content will be problematic (example: mapping rates will be very low). How to check the fastq type/scaling is covered by the support wiki (link in the other post) for FASTQ data.
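To make the scaling difference concrete, here is a rough sketch of what rescaling Phred +64 quality strings (fastqillumina 1.3-1.7) to Sanger Phred +33 involves. It is not the Fastq Groomer implementation, and the function names are hypothetical. It only covers the simple offset shift, not older Solexa-scaled scores, which need a different transformation, and it assumes four-line records.

```python
def phred64_to_phred33(quality):
    """Shift one quality string from a +64 offset to a +33 offset."""
    shifted = []
    for ch in quality:
        score = ord(ch) - 64
        if score < 0:
            # A character below '@' cannot be a valid Phred+64 score.
            raise ValueError(f"{ch!r} is not valid in Phred+64 data")
        shifted.append(chr(score + 33))
    return "".join(shifted)


def convert_lines(lines):
    """Yield FASTQ lines with every 4th line (the quality string) rescaled."""
    for i, line in enumerate(lines):
        if i % 4 == 3:
            yield phred64_to_phred33(line.rstrip("\n")) + "\n"
        else:
            yield line


if __name__ == "__main__":
    print(phred64_to_phred33("hhhh"))  # IIII
```

The same read quality (Q40) is written as "h" in Phred +64 and as "I" in Phred +33, which is why a tool that reads +64 data as fastqsanger sees wrong scores without necessarily failing.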
If there is a mismatch between the actual format and the assigned datatype, tool problems will result, either as an error or as poor-quality scientific results.