Question: Issue With Saving 'Manipulate Fastq' In Workflow; And Request For Advice Dealing With Barcoded 454 Data
Pip Griffin • 60 wrote:
Hi,
I'm a new user, learning how to use Galaxy while I wait for my 454 results. So I'm not actually playing with any data yet, but I'm trying to set up a draft workflow as practice. Two issues:
Issue 1.
I am having trouble with the 'manipulate fastq' command. Without this, my workflow saves quickly and seems fine, but when I include even a (seemingly simple) 'manipulate fastq' step, it tries to save for many minutes, unsuccessfully, until I get sick of it and close the window.
Issue 2.
Well, this isn't really an issue, just a request for advice! My dataset will be a barcoded amplicon library, containing 8 different gene regions (which I can recognise from the amplicon-specific primer sequences) amplified in 64 different individuals (which I can recognise by an individual-specific barcode sequence). I thought I'd set up a workflow with the following steps:
1) Convert to FASTQ format.
2) Grooming, filtering to remove short reads etc.
3) 'Manipulate FASTQ' to match all sequences containing one of the eight reverse primer sequences, and reverse-complement them.
4) FASTQ-to-tabular format conversion.
5) Eight separate 'select' steps to select sequences with a match to either the forward primer or the reverse-complemented reverse primer of the desired gene region.
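To make the logic of steps 3 and 5 concrete, here is a rough sketch in plain Python, outside Galaxy. It is only an illustration of the idea, not how the Galaxy tools work internally. The primer sequences and the names `PRIMERS`, `orient_read` and `assign_gene` are made-up placeholders, and only two of the eight gene regions are shown.

```python
# Hypothetical primers for two of the eight gene regions.
# These are placeholder sequences, not real primers.
PRIMERS = {
    "geneA": {"fwd": "ACGTTGCA", "rev": "GGATCCTA"},
    "geneB": {"fwd": "TTGACCGA", "rev": "CATGCAAG"},
}

_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def orient_read(seq, qual):
    """Step 3: if the read contains any reverse primer, reverse-complement
    the sequence and reverse the quality string so they stay aligned."""
    for primers in PRIMERS.values():
        if primers["rev"] in seq:
            return reverse_complement(seq), qual[::-1]
    return seq, qual


def assign_gene(seq):
    """Step 5: return the gene whose forward primer, or whose
    reverse-complemented reverse primer, occurs in the read (else None)."""
    for gene, primers in PRIMERS.items():
        if primers["fwd"] in seq or reverse_complement(primers["rev"]) in seq:
            return gene
    return None


# Example: a read that starts from the geneA reverse primer.
seq, qual = orient_read("NNGGATCCTAACGT", "ABCDEFGHIJKLMN")
gene = assign_gene(seq)
```

This uses exact substring matching, so a read with a sequencing error inside the primer would end up unassigned; whether that is acceptable depends on the data.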
My question is: does this seem sensible? Is there a more efficient way to do this that I haven't discovered yet? I was thinking I'd then set up another workflow to label barcoded individuals, for which I could use each of the eight gene 'output files' in turn as input.
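The barcode-labelling idea could look roughly like the Python sketch below. The barcodes, the 8-base barcode length, and the names `BARCODES` and `label_individual` are all assumptions for illustration. It also assumes the barcode sits at the very start of the read; a read that was reverse-complemented in step 3 would presumably carry its barcode at the other end, reverse-complemented, and would need separate handling.

```python
from collections import Counter

# Hypothetical barcode -> individual table (placeholders; a real run
# would list all 64 individuals).
BARCODES = {
    "ACGAGTGC": "ind01",
    "ACGCTCGA": "ind02",
}
BARCODE_LEN = 8  # assumed fixed barcode length


def label_individual(seq):
    """Return (individual, read with barcode trimmed off).

    Returns (None, seq) unchanged if the first BARCODE_LEN bases
    do not exactly match a known barcode."""
    individual = BARCODES.get(seq[:BARCODE_LEN])
    if individual is None:
        return None, seq
    return individual, seq[BARCODE_LEN:]


# Example: count reads per individual for one gene's set of reads.
reads = ["ACGAGTGCTTTT", "ACGCTCGAGGGG", "ACGAGTGCAAAA", "TTTTTTTTTTTT"]
counts = Counter(label_individual(r)[0] for r in reads)
```

Here `counts` tallies reads per individual, with unmatched reads collected under `None`.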
Thanks so much for this service! The screencasts are especially great.
Pip Griffin
University of Melbourne, Australia