Fastq Joiner Problem

Question: Fastq Joiner Problem

7.1 years ago by

United States

Hi,

I am trying to join two groomed fastq files from a paired-end Illumina run using the FASTQ joiner tool. The drop-down menus correctly identify the groomed fastq files, but after cranking for a few minutes the tool produces empty output:

FASTQ joiner on data 5 and data 4
empty
format: fastqsanger, database: ?
Info: There were 3497909 known sequence reads not utilized.
Joined 0 of 3497909 read pairs (0.00%).

The files have the same number of reads (3497909), the reads have the same number of bases (102), and the joiner tool doesn't have any options (other than choosing the two files to join). I have tried this with Sanger and with Illumina 1.3+ quality scores, and in both (left-right) orientations. I've pasted the beginnings of the two files below my signature in case this is useful for diagnosing the problem.

Can anyone tell me what I'm doing wrong?

Thanks,

--
Matthew D. Herron, PhD
Department of Zoology
University of British Columbia
X.princeps@gmail.com
http://www.eebweb.arizona.edu/grads/mherron/

Sample of read 1:

@HWI-ST765:83:D091AACXX:1:1101:1202:2130 1:N:0:ACTTGA
TTATTCCGTTTACCTTCACGCTGTTATGGCTCTCGGTGTTCGGCAATAGCGCGCTGTATGAAATTATCCA
CGGCGGCGCGGCATTTGCCGAGGAAGCGATG
+
CCCFFFFFHHHHHJJJJJJJJJJJJIJIJJJIEHII?FGIIJJJJJJJIJJJHFFDDEEEDEDDDDEDDD
DDDDDDDDDDDDDDDD@CCD3<BDD@DDDBD

Sample of read 2:

@HWI-ST765:83:D091AACXX:1:1101:1202:2130 2:N:0:ACTTGA
ATTCCCCAGCACCAGCGCCCCGGAGTCCGCCGAGGTCACATAAAACAGCAGGCCAGTAATGGTGGCGACG
GAGGCGCTAAAGGTAAACGCCGGATACTGCG
+
CCCFFFFFFGHHHJJJJJJJJJIJIFHIIJJJJIG:BEFFFFEEEDDCBDDDDDDD@CDDDCACDDDBDD
DDDDBDDDDDDDC>@CCC@@DDDDDDDDEDB


modified 7.1 years ago by Jennifer Hillman Jackson ♦ 25k • written 7.1 years ago by Matthew Herron • 20

7.1 years ago by

Jennifer Hillman Jackson ♦ 25k

United States

Jennifer Hillman Jackson ♦ 25k wrote:

Hello Matthew,

Lucinda is correct: this tool does not interpret the new ID format correctly. I have opened a bitbucket ticket to track the issue:
https://bitbucket.org/galaxy/galaxy-central/issue/677/update-joiner-tool-to-work-with-casava-18

For now, there is a work-around:

1 - Make certain to run the FASTQ Groomer with input quality scores set to "Sanger" and leave the rest of the form options as default.
2 - Use the tool "NGS: QC and manipulation -> FASTX-Toolkit for FASTQ data -> Rename sequences" to set the sequence names as "numeric".

Do #1 & #2 for each file.

3 - Run Joiner with the file orders as appropriate for left/right.

Thanks for reporting the issue!

Best,
Jen
Galaxy team

--
Jennifer Jackson
http://usegalaxy.org
http://galaxyproject.org/wiki/Support

written 7.1 years ago by Jennifer Hillman Jackson ♦ 25k
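Step 2 of the work-around can be illustrated outside Galaxy with a minimal Python sketch. It is not the FASTX-Toolkit implementation; `rename_numeric` is a hypothetical helper, and it assumes each record occupies exactly four lines (no wrapped sequence or quality lines). The idea is that the i-th record of each file gets the same numeric name, which presumably is what lets the joiner pair them up.

```python
def rename_numeric(lines):
    """Yield FASTQ lines with each record's header replaced by "@1", "@2", ...

    Hypothetical helper for illustration only.
    Assumes strict 4-line records (header, sequence, "+", quality).
    """
    for index, line in enumerate(lines):
        line = line.rstrip("\n")
        position = index % 4
        if position == 0:
            if not line.startswith("@"):
                raise ValueError(
                    f"line {index + 1}: expected a '@' header, got {line!r}"
                )
            # Record number is 1-based, so the first record becomes "@1".
            yield f"@{index // 4 + 1}"
        elif position == 2:
            # Drop any repeated name after the "+" separator.
            yield "+"
        else:
            # Sequence and quality lines pass through unchanged.
            yield line
```

Running the same function over the forward and the reverse file (each already groomed) gives both files matching names in matching order, provided the two files list their reads in the same order.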

7.1 years ago by

Lucinda Lawson • 20

Lucinda Lawson • 20 wrote:

I'm having the same issue (though with the interlacer). I suspect that it's a problem with the way the forward and reverse reads are identified. Mine look like yours, where forward is 1:N and reverse is 2:N instead of the /1 and /2 that the tool says it expects. Our data are Illumina pipeline 1.9 (HiSeq), so maybe that's the problem? I don't actually know, however, nor do I know how to fix this. It was just interesting to have run into this problem today and then seen your email.

-Lucinda

--
Lucinda Lawson
Postdoctoral Research Computational Biologist
USDA-ARS
Gainesville, FL

written 7.1 years ago by Lucinda Lawson • 20
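The difference between the two header styles discussed above can be checked with a short Python sketch. The function name `split_header` and both regular expressions are illustrative assumptions, not code from the Galaxy joiner or interlacer; they only cover the two layouts seen in this thread (a trailing `/1` or `/2`, and a space followed by `1:N:0:ACTTGA`-style fields).

```python
import re

# Older style: "@NAME/1" or "@NAME/2".
OLD_STYLE = re.compile(r"^@(?P<name>\S+)/(?P<mate>[12])$")
# Newer style seen in this thread: "@NAME 1:N:0:ACTTGA".
CASAVA_18 = re.compile(r"^@(?P<name>\S+) (?P<mate>[12]):[YN]:\d+:\S*$")


def split_header(header):
    """Return (style, name, mate) for one FASTQ header line.

    Hypothetical helper: style is "old", "casava18", or "unknown";
    mate is 1, 2, or None when the layout is not recognised.
    """
    header = header.rstrip("\n")
    for style, pattern in (("old", OLD_STYLE), ("casava18", CASAVA_18)):
        match = pattern.match(header)
        if match:
            return style, match.group("name"), int(match.group("mate"))
    return "unknown", header.lstrip("@"), None


# The two headers posted in the question share a name and differ only in mate.
forward = split_header("@HWI-ST765:83:D091AACXX:1:1101:1202:2130 1:N:0:ACTTGA")
reverse = split_header("@HWI-ST765:83:D091AACXX:1:1101:1202:2130 2:N:0:ACTTGA")
print(forward[0], forward[1] == reverse[1], forward[2], reverse[2])
```

A tool that only looks for a `/1` or `/2` suffix would find neither in these headers, which is consistent with the "Joined 0 of 3497909 read pairs" result reported in the question.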
