Multi Fastq joining

Question: Multi Fastq joining

4.5 years ago by

United States

ghliu83 • 0 wrote:

Hi all, how can I join multiple FASTQ files of left-hand reads (or right-hand reads) together? I need to map sequence data from different experiments to a common reference, but first I need to put all left-hand reads into one file and all right-hand reads into another. Thank you.

assembly tool textmanipulation • 2.7k views


modified 4.5 years ago by Jennifer Hillman Jackson ♦ 25k • written 4.5 years ago by ghliu83 • 0

Thank you. I will try.

written 4.5 years ago by ghliu83 • 0

You state that you want to map paired-end reads to a reference. If the reference genome exists, you would probably be well advised to map them as paired-end reads to take advantage of the pairing. If the reference genome does not exist, you would have to wonder about the sanity of whoever chose paired-end reads - you'd be better off with lots of very long single-end reads to assemble.

As you probably know, most mappers (e.g. bowtie or bwa) understand paired data, and you will gain mapping precision if the mapper treats each pair as a pair and ensures that both ends of the pair map correctly. There may be good reasons to construct a joined file, but mapping to a known reference is not one of them IMHO.
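To make the "map them as a pair" point concrete, here is a rough sketch of how the two mate files are typically handed to a mapper. It only builds the command lines and does not run them. The file names (`ref.fa`, `ref_index`, `r1.fastq`, `r2.fastq`, `out.sam`) are placeholders, and bowtie2 is used here as an assumption in place of the bowtie mentioned above; check your mapper's own help for the options that apply to your version.

```python
import shlex


def bwa_mem_paired(reference, r1, r2):
    """Argument list for `bwa mem` in paired-end mode.

    Giving two FASTQ files after the reference tells bwa mem to treat
    read i of r1 and read i of r2 as mates. SAM output goes to stdout.
    """
    return ["bwa", "mem", reference, r1, r2]


def bowtie2_paired(index_prefix, r1, r2, sam_out):
    """Argument list for bowtie2 in paired-end mode (-1/-2 name the mate files)."""
    return ["bowtie2", "-x", index_prefix, "-1", r1, "-2", r2, "-S", sam_out]


# Placeholder file names, for illustration only.
bwa_cmd = bwa_mem_paired("ref.fa", "r1.fastq", "r2.fastq")
bt2_cmd = bowtie2_paired("ref_index", "r1.fastq", "r2.fastq", "out.sam")

# Print shell-ready strings; pass the lists to subprocess.run() to execute them.
print(shlex.join(bwa_cmd))
print(shlex.join(bt2_cmd))
```

Either way the mapper relies on the two files listing the reads in the same order, so any joining of files has to preserve that order for both mates.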

OTOH, if you have no reference genome, you might be trying to create a de novo reference from all available sequence. In that case you may need to try assembling (velvet/abyss etc.) all the sequences (all pairs) from all samples. Concatenation will work as described below, but unless you have a huge amount of sequence, your de novo reference may consist of lots of short contigs which don't allow the pairs to map properly.

modified 4.5 years ago • written 4.5 years ago by fubar ♦ 1.1k

I guess what the OP is trying to do is not to merge the two reads of each pair into one file - his wording is just slightly misleading. Instead, I think he simply has fastq input files like this:

r1.01.fastq, r1.02.fastq, r1.03.fastq

r2.01.fastq, r2.02.fastq, r2.03.fastq

and he would like to join all r1 files into one r1.fastq file and all r2 files into one r2.fastq to then pass these two files to an aligner.
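Outside Galaxy, that kind of join could be sketched in a few lines of Python. This is only an illustration under some assumptions: the files are uncompressed, each one ends with a newline, every record is the usual four lines, and the names follow the `r1.NN.fastq` / `r2.NN.fastq` pattern above. The function names are made up for this example.

```python
import shutil
from pathlib import Path


def concat_fastq(inputs, output):
    """Append the input files to `output`, byte for byte, in the order given."""
    with open(output, "wb") as out:
        for path in inputs:
            with open(path, "rb") as src:
                shutil.copyfileobj(src, out)


def count_records(path):
    """Count FASTQ records, assuming exactly four lines per record."""
    with open(path, "rb") as handle:
        return sum(1 for _ in handle) // 4


def join_mates(directory="."):
    """Join r1.*.fastq into r1.fastq and r2.*.fastq into r2.fastq."""
    directory = Path(directory)
    # Sort both lists the same way so that read i in r1.fastq
    # still has its mate at position i in r2.fastq.
    r1_parts = sorted(directory.glob("r1.*.fastq"))
    r2_parts = sorted(directory.glob("r2.*.fastq"))
    if not r1_parts or len(r1_parts) != len(r2_parts):
        raise ValueError("expected the same non-zero number of r1 and r2 files")

    r1_out = directory / "r1.fastq"
    r2_out = directory / "r2.fastq"
    concat_fastq(r1_parts, r1_out)
    concat_fastq(r2_parts, r2_out)

    # Cheap sanity check: both mates must end up with the same number of reads.
    if count_records(r1_out) != count_records(r2_out):
        raise ValueError("r1.fastq and r2.fastq have different read counts")
    return r1_out, r2_out
```

Calling `join_mates()` in the directory that holds the files would write `r1.fastq` and `r2.fastq` next to them. The important part is the file order: if the r1 and r2 pieces are joined in different orders, the mates no longer line up and paired-end mapping will go wrong.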

@ghliu83: correct me if I'm wrong.

written 4.5 years ago by Wolfgang Maier • 600

Thank you. That is what I want. I used the Concatenate datasets tool to join all r1 reads into one file and all r2 reads into another, and it worked. Now the mapping job is waiting to run.

written 4.5 years ago by ghliu83 • 0
