Question: Barcode Splitter On Paired End Illumina Reads
Asked 5.6 years ago by Veranja Liyanapathirana
Dear Galaxy team,

I am sorry for repeatedly posting the same question, but I do need some input on this.

Please let me know the best way to use Barcode Splitter on paired-end MiSeq data. The data is already split by Illumina index using MiSeq Reporter; what I want to do is split on some in-house barcodes within each sample. The barcodes are present at both the 5' and 3' ends, and they are the same at both ends. Please let me know which of these is the best practice:

1. Join reads 1 and 2, barcode split, then split the two reads apart again
2. Split reads 1 and 2 separately, then join them using FASTQ joiner and split again

Basically, I want to exclude any pair where the two same-numbered reads are not categorised into the same barcode.

Thanks

Kind regards,
Veranja

Veranja Liyanapathirana
Graduate student
Microbiology, CUHK
galaxy • 2.2k views
Modified 5.6 years ago by Jennifer Hillman Jackson • written 5.6 years ago by Veranja Liyanapathirana
Jennifer Hillman Jackson (United States) wrote 5.6 years ago:
Hello Veranja,

Either process is fine. In one you are dividing the groups based on a single match to one end of the pair; in the other, based on two independent matches, one to each end of the pair. So the first is less stringent, the second more so. If there are differences in the final result, this is likely due to sequence quality. If you know the quality is better at the 5' end or vice versa, then splitting on that single end only might simplify things, but this is your call. A tool like "FastQC" can be useful when assessing quality.

Using the joiner tool is one way to quickly sync the files (and discard unmatched parts of pairs). You could also use identifiers, as I suggested before. But this is often not necessary. Instead, many people go forward with mapping and filter for properly paired mapped data after the alignment step.

Our team doesn't recommend a workflow that uses a join/split method to sync paired inputs prior to mapping (mainly to reduce unnecessary processing and disk space usage, not because it is harmful).

Others are welcome to add additional comments/feedback to this thread.

Best wishes for your project,

Jen
Galaxy team

--
Jennifer Hillman-Jackson
Galaxy Support and Training
http://galaxyproject.org
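For readers who want to see the stricter option (two independent matches that must agree) outside of Galaxy, here is a minimal Python sketch. It is not the Barcode Splitter tool's implementation. It assumes that each mate starts with the barcode in its own read orientation, that matching is exact with no mismatches allowed, that all barcodes have the same length, and that the two files list the mates in the same order. The function names and the `barcodes` dictionary are hypothetical, chosen only for illustration.

```python
from itertools import islice


def read_fastq(lines):
    """Yield (header, seq, plus, qual) tuples from an iterable of FASTQ lines."""
    it = iter(lines)
    while True:
        record = [line.rstrip("\n") for line in islice(it, 4)]
        if len(record) < 4:
            return
        yield tuple(record)


def match_barcode(seq, barcodes):
    """Return the name of the barcode found at the start of seq, or None."""
    for name, bc in barcodes.items():
        if seq.startswith(bc):
            return name
    return None


def split_pairs(r1_lines, r2_lines, barcodes):
    """Bin read pairs by barcode, keeping only pairs whose mates agree."""
    bins = {name: [] for name in barcodes}
    discarded = 0
    # zip() relies on the assumption that both inputs are in the same order.
    for rec1, rec2 in zip(read_fastq(r1_lines), read_fastq(r2_lines)):
        b1 = match_barcode(rec1[1], barcodes)
        b2 = match_barcode(rec2[1], barcodes)
        if b1 is not None and b1 == b2:
            bins[b1].append((rec1, rec2))
        else:
            # Unmatched on either mate, or mates assigned to different barcodes.
            discarded += 1
    return bins, discarded


# Toy data: made-up barcodes and reads, for illustration only.
barcodes = {"BC1": "ACGT", "BC2": "TTAG"}
r1 = ["@read1/1", "ACGTGGGG", "+", "IIIIIIII",
      "@read2/1", "ACGTCCCC", "+", "IIIIIIII",
      "@read3/1", "TTAGAAAA", "+", "IIIIIIII"]
r2 = ["@read1/2", "ACGTTTTT", "+", "IIIIIIII",
      "@read2/2", "TTAGCCCC", "+", "IIIIIIII",
      "@read3/2", "TTAGGGGG", "+", "IIIIIIII"]

bins, discarded = split_pairs(r1, r2, barcodes)
print({name: len(pairs) for name, pairs in bins.items()}, discarded)
```

In the toy data, read2 is dropped because its mates match different barcodes, which is the exclusion asked about in the question. With real files you would pass open file handles instead of lists and write each bin out rather than holding it in memory.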