Question: Deconvoluting NGS Samples With Multiple Barcodes
Pip Griffin • 60 wrote:
Hi,
I have a sequence file containing 454 reads from 64 barcoded samples.
I also have a second 'query': a file listing the names of all 64
samples and, for each one, the corresponding 'sample identifier
sequence' (19 bp) in the following format:
(AGGTTGATTGAATGGCTTA)|(GATGAAGAACGCAGAACCT)
(I need to search for either the forward or the reverse identifier.)
I want to 'join' the two queries by searching each read in the first
query for the 'sample identifier sequence' from the second query, so
that I end up with a copy of the first query with a new column
containing the sample name.
However, the 'join' command only returns exact matches between columns.
How can I join two queries on a partial match? (Only 19 bp of each
read will overlap with the identifier sequence.)
I could use the 'manipulate fastq' command, but I would have to do 64
separate steps, as far as I can tell.
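To make the goal concrete, here is a minimal Python sketch of the partial-match 'join' I am after. It is only an illustration, not a tested pipeline. It assumes the sample file has one line per sample in the form `name<TAB>(FORWARD)|(REVERSE)` (the column layout is my assumption), and that the reads are already available as (read ID, sequence) pairs. The function names and the second sample below are hypothetical.

```python
import re


def load_identifiers(lines):
    """Parse 'name<TAB>(FWD)|(REV)' lines into (name, compiled regex) pairs.

    The tab-separated layout is an assumption about the sample file.
    """
    table = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        name, pattern = line.split("\t")
        # The '(FWD)|(REV)' format is already a valid regular expression
        # that matches either the forward or the reverse identifier.
        table.append((name, re.compile(pattern)))
    return table


def assign_sample(sequence, table):
    """Return the first sample whose identifier occurs anywhere in the read."""
    for name, regex in table:
        if regex.search(sequence):
            return name
    return None


def annotate_reads(reads, table):
    """Yield (read_id, sequence, sample_name): the read plus a new column."""
    for read_id, sequence in reads:
        yield read_id, sequence, assign_sample(sequence, table) or "unassigned"


if __name__ == "__main__":
    samples = [
        "sample_A\t(AGGTTGATTGAATGGCTTA)|(GATGAAGAACGCAGAACCT)",
        "sample_B\t(ACGTACGTACGTACGTACG)|(TTTTGGGGCCCCAAAATTT)",  # made-up sample
    ]
    reads = [
        ("read1", "CCCAGGTTGATTGAATGGCTTAGGG"),  # made-up reads
        ("read2", "ACGT"),
    ]
    table = load_identifiers(samples)
    for row in annotate_reads(reads, table):
        print("\t".join(row))
```

This loops over all 64 identifiers for every read in a single pass, rather than running 64 separate filtering steps. It only finds exact occurrences of the 19 bp identifier, so reads with sequencing errors inside the identifier would be left unassigned.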
I would really appreciate any help with what is probably quite a
simple problem!
Thanks very much,
Pip Griffin