Question: Fastq Joiner Problem
7.1 years ago
United States
Matthew Herron (20) wrote:
Hi,

I am trying to join two groomed fastq files from a paired-end Illumina read using the fastq joiner tool. The drop-down menus correctly identify the groomed fastq files, but after cranking for a few minutes the tool produces empty output:

"FASTQ joiner on data 5 and data 4
empty
format: fastqsanger, database: ?
Info: There were 3497909 known sequence reads not utilized.
Joined 0 of 3497909 read pairs (0.00%)."

The files have the same number of reads (3497909), reads have the same number of bases (102), and the joiner tool doesn't have any options (other than choosing the two files to join). I have tried this with Sanger and with Illumina 1.3+ quality scores, and in both (left-right) orientations. I've pasted the beginnings of the two files below my signature in case this is useful for diagnosing the problem.

Can anyone tell me what I'm doing wrong?

Thanks,
--
Matthew D. Herron, PhD
Department of Zoology
University of British Columbia

Sample of read 1:
@HWI-ST765:83:D091AACXX:1:1101:1202:2130 1:N:0:ACTTGA
TTATTCCGTTTACCTTCACGCTGTTATGGCTCTCGGTGTTCGGCAATAGCGCGCTGTATGAAATTATCCACGGCGGCGCGGCATTTGCCGAGGAAGCGATG
+
CCCFFFFFHHHHHJJJJJJJJJJJJIJIJJJIEHII?FGIIJJJJJJJIJJJHFFDDEEEDEDDDDEDDDDDDDDDDDDDDDDDDD@CCD3<BDD@DDDBD

Sample of read 2:
@HWI-ST765:83:D091AACXX:1:1101:1202:2130 2:N:0:ACTTGA
ATTCCCCAGCACCAGCGCCCCGGAGTCCGCCGAGGTCACATAAAACAGCAGGCCAGTAATGGTGGCGACGGAGGCGCTAAAGGTAAACGCCGGATACTGCG
+
CCCFFFFFFGHHHJJJJJJJJJIJIFHIIJJJJIG:BEFFFFEEEDDCBDDDDDDD@CDDDCACDDDBDDDDDDBDDDDDDDC>@CCC@@DDDDDDDDEDB
7.1 years ago
United States
Jennifer Hillman Jackson (25k) wrote:
Hello Matthew,

Lucinda is correct, this tool does not interpret the new ID format correctly. I have opened a bitbucket ticket to track the issue: tool-to-work-with-casava-18

For now, there is a work-around:

1 - Make certain to run the FASTQ Groomer with input quality scores set to "Sanger" and leave the rest of the form options as default.
2 - Use the tool "NGS: QC and manipulation -> FASTX-Toolkit for FASTQ data -> Rename sequences" to set the sequence names as "numeric".
Do #1 & #2 for each file.
3 - Run Joiner with the file orders as appropriate for left/right.

Thanks for reporting the issue!

Best,
Jen
Galaxy team
--
Jennifer Jackson
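
As a rough illustration of step 2, the sketch below shows what renaming to numeric identifiers amounts to: each record's header is replaced by its position in the file, so read N in the left file and read N in the right file end up with the same name. This is not the FASTX-Toolkit's code. `rename_numeric` is a hypothetical helper, and it assumes strict four-line FASTQ records and that both files list mates in the same order.

```python
def rename_numeric(lines):
    """Return FASTQ lines with headers replaced by '@1', '@2', ...

    Assumes every record is exactly four lines (header, sequence,
    '+' separator, quality). Hypothetical helper for illustration only.
    """
    renamed = []
    for index, line in enumerate(lines):
        line = line.rstrip("\n")
        position = index % 4
        if position == 0:
            # Header line: use the 1-based record number as the new name.
            renamed.append(f"@{index // 4 + 1}")
        elif position == 2:
            # Separator line: drop any repeated identifier after '+',
            # since it would no longer match the new header.
            renamed.append("+")
        else:
            # Sequence and quality lines are kept unchanged.
            renamed.append(line)
    return renamed
```

Applied to both files separately, this gives the first pair the name `@1` in each file, the second `@2`, and so on. If the two files are not in the same order, the renamed pairs would be mismatched, so the order assumption matters.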
7.1 years ago
Lucinda Lawson (20) wrote:
I'm having the same issue (though with the interlacer). I suspect that it's an issue with the way the forward and reverse reads are read. Mine look like yours, where forward is 1:N and reverse is 2:N instead of the /1 and /2 that the tool says it expects. Our data are from Illumina pipeline 1.9 (HiSeq), so maybe that's the problem? I don't actually know, though, or how to fix it. It was just interesting to run into this problem today and then see your email.

-Lucinda
--
Lucinda Lawson
Postdoctoral Research Computational Biologist
USDA-ARS
Gainesville, FL
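
To make the header difference concrete, here is a minimal sketch that rewrites the newer style of header seen in the samples (`@ID 1:N:0:ACTTGA`) into the older `@ID/1` form. The regular expression is an assumption based only on the two sample headers in this thread, the function names are hypothetical, and the thread does not confirm that the joiner or interlacer accepts files rewritten this way.

```python
import re

# Assumed shape of the newer header, based on the samples in this thread:
# "@<read id> <read number>:<Y or N>:<control number>:<index sequence>"
NEW_STYLE_HEADER = re.compile(r"^@(\S+) ([12]):[YN]:\d+:\S*$")


def to_legacy_id(header):
    """Rewrite '@ID 1:N:0:ACTTGA' as '@ID/1'; return other headers unchanged."""
    header = header.rstrip("\n")
    match = NEW_STYLE_HEADER.match(header)
    if match is None:
        return header
    read_id, read_number = match.groups()
    return f"@{read_id}/{read_number}"


def convert_fastq(lines):
    """Yield FASTQ lines with converted headers (strict 4-line records assumed)."""
    for index, line in enumerate(lines):
        line = line.rstrip("\n")
        # Only every fourth line is a header; quality lines may also
        # start with '@', so position is used rather than the first character.
        yield to_legacy_id(line) if index % 4 == 0 else line
```

For the forward sample above, `to_legacy_id` gives `@HWI-ST765:83:D091AACXX:1:1101:1202:2130/1`, and the reverse read gets the same identifier ending in `/2`.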
