Select FASTQ reads by sequence

Question: Select FASTQ reads by sequence

PaulW • 60 wrote:

Which tool should I use to select all reads from a FASTQ file that contain any of about 100 short sequences listed in a file such as:

ACAGTCAGCTAGCATCGATCCTAGCTAGAC
GCATCACGACTACGACGTACATCTAGCATG
etc.

Is there a tool in Galaxy which will do this?

Alternatively, would BBDuk work for this?

modified 23 months ago by Jennifer Hillman Jackson ♦ 25k • written 23 months ago by PaulW • 60

BTW, the FASTQ reads are 150-base Illumina reads.

written 23 months ago by PaulW • 60

Jennifer Hillman Jackson ♦ 25k wrote:

Hello,

The tool Select can pattern match, one pattern at a time, through a regular expression. But that won't be the best solution for this query.

Instead, try a mapping tool such as LASTZ. Create a custom reference genome from the 100 query sequences, map the FASTQ dataset against it, then filter the output by percent identity and coverage.

Help: https://wiki.galaxyproject.org/Support#Custom_reference_genome
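As a rough illustration of the final filtering step, here is a minimal Python sketch. It assumes the mapping output has been reduced to a tab-separated table with three columns (read name, percent identity, percent coverage). That column layout, the function name, and the 95% cutoffs are hypothetical and chosen only for illustration; the real column order depends on the output format selected in the mapping tool.

```python
import csv
import io


def passing_read_names(tabular_text, min_identity=95.0, min_coverage=95.0):
    """Return the names of reads with at least one hit passing both cutoffs.

    Assumed (hypothetical) layout: tab-separated, no header,
    columns = read name, percent identity, percent coverage.
    Values may carry a trailing '%' sign, which is stripped.
    """
    keep = set()
    for row in csv.reader(io.StringIO(tabular_text), delimiter="\t"):
        # Skip blank lines and comment lines.
        if not row or row[0].startswith("#"):
            continue
        name = row[0]
        identity = float(row[1].rstrip("%"))
        coverage = float(row[2].rstrip("%"))
        if identity >= min_identity and coverage >= min_coverage:
            keep.add(name)
    return keep
```

The resulting set of read names could then be used to pull the matching records out of the original FASTQ dataset.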

I cannot help you with BBDuk, but perhaps someone else here can, or you can review other online sites (a Google search brings up plenty of usage discussion).

Thanks, Jen, Galaxy team

written 23 months ago by Jennifer Hillman Jackson ♦ 25k

Jen, thanks for that interesting suggestion. Unfortunately, Lastz doesn't process FASTQ files. Worse, the Galaxy implementation of Lastz apparently doesn't expose the "--ambiguous=iupac" command line option, so converting FASTQ to FASTA didn't work. I'll keep searching. There are a couple of things in the Tool Shed that look like they might help.

written 23 months ago by PaulW • 60
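If exact matches are enough, the selection asked about in the question can also be sketched outside Galaxy in a few lines of Python. This is only a minimal sketch under stated assumptions: each FASTQ record is exactly four lines (no wrapped sequences), the pattern file holds whitespace-separated sequences, matching is exact (no mismatches), and reverse complements are also searched by default. The function names are hypothetical, and this is not a Galaxy tool.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T, N)."""
    return seq.translate(str.maketrans("ACGTNacgtn", "TGCANtgcan"))[::-1]


def load_patterns(lines, both_strands=True):
    """Collect query sequences from an iterable of text lines.

    Assumption: sequences are separated by whitespace or newlines.
    """
    patterns = set()
    for line in lines:
        for token in line.split():
            seq = token.upper()
            patterns.add(seq)
            if both_strands:
                patterns.add(reverse_complement(seq))
    return patterns


def select_reads(fastq_lines, patterns):
    """Yield 4-line FASTQ records whose sequence contains any pattern.

    Assumption: every record is exactly four lines.
    """
    record = []
    for line in fastq_lines:
        record.append(line.rstrip("\n"))
        if len(record) == 4:
            seq = record[1].upper()
            if any(p in seq for p in patterns):
                yield tuple(record)
            record = []
```

With about 100 patterns and 150-base reads, the simple `any(...)` scan should be adequate for moderate dataset sizes. It will miss reads that differ from a query sequence by even one base, which is where a mapping-based approach is the better fit.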
