Hello,
The Select tool can pattern match through a regular expression, but only one pattern at a time, so it is not the best solution for this query.
Instead, try a mapping tool such as Lastz. Create a custom reference genome from the 100 query sequences, map the FASTQ dataset against it, then filter the output by percent identity and coverage.
Help: https://wiki.galaxyproject.org/Support#Custom_reference_genome
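If it helps to see what the final filtering step amounts to, here is a rough sketch in Python. It is not what any Galaxy tool runs internally. It assumes the mapping output has been exported as tab-separated text with four columns (read name, reference name, percent identity, percent coverage); that column layout, the function name filter_hits, and the 95/90 thresholds are all hypothetical and would need adjusting to your actual output and goals.

```python
import csv
import io


def filter_hits(lines, min_identity=95.0, min_coverage=90.0):
    """Keep alignments that meet both the identity and coverage thresholds.

    Assumed (hypothetical) column layout, tab-separated:
    read name, reference name, percent identity, percent coverage.
    """
    kept = []
    for row in csv.reader(lines, delimiter="\t"):
        # Skip blank lines and header/comment lines starting with '#'.
        if not row or row[0].startswith("#"):
            continue
        read_id, ref_id = row[0], row[1]
        # Tolerate values written with a trailing percent sign, e.g. "99.3%".
        identity = float(row[2].rstrip("%"))
        coverage = float(row[3].rstrip("%"))
        if identity >= min_identity and coverage >= min_coverage:
            kept.append((read_id, ref_id, identity, coverage))
    return kept


# Made-up example rows, only to show the behaviour.
example = (
    "#read\tref\tidentity\tcoverage\n"
    "read1\tq7\t99.3\t100.0\n"
    "read2\tq7\t88.0\t100.0\n"
    "read3\tq12\t97.5\t60.0\n"
)
hits = filter_hits(io.StringIO(example))
for hit in hits:
    print(hit)
```

With these made-up rows only read1 passes: read2 falls below the identity cutoff and read3 below the coverage cutoff. Inside Galaxy, the same kind of two-condition filter can be applied to a tabular dataset without writing any code.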
I cannot help you with BBDUK, but perhaps someone else here can, or you can review other online sites (a Google search brings up plenty of usage discussion).
Thanks, Jen, Galaxy team
BTW, the FASTQ reads are 150-base Illumina reads.