FASTQ Joiner Fails to Join PE Data

Question: FASTQ Joiner Fails to Join PE Data

6.7 years ago by

Hi,

I have HiSeq 2000 paired-end sequence data in two separate FASTQ files. I need to filter out the low-quality sequences to get a good assembly, so I decided to join the PE reads and then filter the low-quality sequences in Galaxy. First, I groomed the data using FASTQ Groomer, keeping "Sanger" as the input FASTQ quality scores type. Then I tried to join the PE sequences using FASTQ joiner. However, FASTQ joiner did not join the PE sequences and only showed the following failure info:

*FASTQ joiner on data 8 and data 9*
0 bytes
format: fastqsanger, database: ?
Info: There were 4000000 known sequence reads not utilized.
Joined 0 of 4000000 read pairs (0.00%).

I am a new user and have no idea where I am going wrong. Please suggest how I can overcome this problem.

Thanks.

assembly • 1.1k views


modified 6.6 years ago by Jennifer Hillman Jackson ♦ 25k • written 6.7 years ago by meganathan pr • 10

6.6 years ago by

Jennifer Hillman Jackson ♦ 25k

United States

Jennifer Hillman Jackson ♦ 25k wrote:

Hello,

The FASTQ Joiner tool is currently being updated to work with the newer sequence ID format. The progress of this change can be tracked here:
https://bitbucket.org/galaxy/galaxy-central/issue/677/update-joiner-tool-to-work-with-casava-18

Meanwhile, quality filtering can be done on each file independently. Then, to sync up the two files (in case sequences are lost from one of the files for quality reasons), a workaround is:

- "NGS: QC and manipulation -> FASTQ to Tabular" on both files
- "Join, Subtract and Group -> Join two Datasets" on c1 from both files
  "Keep lines of first input that do not join with second input:" as yes
  "Keep lines of first input that are incomplete:" as no
  "Fill empty columns:" as no
- "NGS: QC and manipulation -> Tabular to FASTQ" run twice
  Recreate both FASTQ files from the same tabular file. The same sequence identifier column will be used in both runs.

Hopefully this helps until we have the regular FASTQ manipulation tools updated,

Jen
Galaxy team

--
Jennifer Jackson
http://galaxyproject.org
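For anyone who wants to do the same sync-up outside Galaxy, here is a rough Python sketch of the idea behind the workaround above: key each read on its identifier and keep only reads whose mate is still present in the other file. This is not what the Galaxy tools run internally. The function names (`read_fastq`, `pair_key`, `sync_pairs`) and the example read headers are made up for illustration, and the sketch assumes that the newer IDs carry the mate number after a space while older IDs end in `/1` or `/2`. It also holds one whole file in memory, which may not suit very large datasets.

```python
import io

def read_fastq(handle):
    """Yield (header, sequence, quality) from a 4-line-per-record FASTQ stream."""
    while True:
        header = handle.readline().rstrip("\n")
        if not header:
            return  # end of input
        seq = handle.readline().rstrip("\n")
        handle.readline()  # skip the '+' separator line
        qual = handle.readline().rstrip("\n")
        yield header, seq, qual

def pair_key(header):
    """Reduce a read header to an identifier shared by both mates (assumed formats)."""
    name = header[1:].split()[0]      # drop '@' and anything after whitespace
    if name.endswith(("/1", "/2")):   # older style: mate number as a suffix
        name = name[:-2]
    return name

def sync_pairs(forward, reverse):
    """Keep only reads whose mate survived filtering, in forward-file order."""
    reverse_by_key = {pair_key(rec[0]): rec for rec in reverse}
    kept_fwd, kept_rev = [], []
    for rec in forward:
        mate = reverse_by_key.get(pair_key(rec[0]))
        if mate is not None:
            kept_fwd.append(rec)
            kept_rev.append(mate)
    return kept_fwd, kept_rev

# Hypothetical data: the first forward read lost its mate during filtering.
fwd_text = (
    "@M1:1:FC:1:1:10:20 1:N:0:ACGT\nACGT\n+\nIIII\n"
    "@M1:1:FC:1:1:11:21 1:N:0:ACGT\nTTTT\n+\nIIII\n"
)
rev_text = "@M1:1:FC:1:1:11:21 2:N:0:ACGT\nAAAA\n+\nIIII\n"

fwd = list(read_fastq(io.StringIO(fwd_text)))
rev = list(read_fastq(io.StringIO(rev_text)))
kept_fwd, kept_rev = sync_pairs(fwd, rev)
```

With real data, you would pass open file handles to `read_fastq` instead of `io.StringIO` objects and write the two kept lists back out as FASTQ.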

written 6.6 years ago by Jennifer Hillman Jackson ♦ 25k
