Question: Question Regarding Quality Filtering Of 454 Amplicons
Jackie Lighten • 20 wrote:
Hi,
I have a question for you guys regarding quality filtering.
I have a data set of double MID-tagged 454 amplicons, from which I wish
to select high-quality sequences above Q20.
The 454 quality filtering system seems to work differently from the one
used for Illumina sequencing, i.e. 454 filtering extracts high-quality
segments, while Illumina (FASTQ) filtering can select full high-quality
reads based on certain parameters.
OK, so I know that the total length of my amplicon, including primers
and barcodes, is around 260 bp. If I then set the 454 quality filtering
tool to extract contiguous high-quality sequence of >260, it gives me
back around 45% of my raw data as meeting this criterion, i.e. all
260 bp are above Q20. I don't necessarily need this high stringency, as
most bases may not be informative.
But if I convert my 454 data to FASTQ format and then run the Illumina
filtering system, which also allows me to set the number of bases
allowed to deviate from the Q20 criterion, I get back over 90% of my
data (allowing 10 bp to deviate from Q20).
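
For illustration, the "allow a fixed number of bases below Q20" criterion
described above could be sketched in Python roughly as follows. This is
only a minimal sketch, not the actual implementation of the Illumina or
454 filtering tools; the function names are hypothetical, and the
decoder assumes Sanger-style Phred+33 FASTQ quality strings.

```python
def phred33_to_scores(quality_string):
    """Decode a FASTQ quality string, ASSUMING Sanger/Phred+33 encoding."""
    return [ord(ch) - 33 for ch in quality_string]


def passes_with_tolerance(scores, min_q=20, max_low=10):
    """Return True if at most `max_low` bases have quality below `min_q`."""
    low = sum(1 for q in scores if q < min_q)
    return low <= max_low


# Made-up example read: 260 bases, 8 of them below Q20.
scores = [30] * 252 + [10] * 8
print(passes_with_tolerance(scores))             # True: 8 low bases <= 10
print(passes_with_tolerance(scores, max_low=0))  # False: every base must be >= Q20
```

Setting `max_low=0` mimics the strict "all 260 bp above Q20" requirement,
which is why far fewer reads would pass under that setting.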
I then need to go ahead and convert back to 454 format.
Can you tell me if this is OK?
Will I lose or confuse information somewhere along these conversions?
It seems that if I do this, my barcodes are removed, as the amplicons
do not sort properly when I pass them through my barcode filtering
program.
Does anyone know of a program to filter 454 data based on average
sequence quality score that doesn't involve Linux or the Roche
off-instrument program? (I have no experience with Linux!)
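
As a rough sketch of what filtering on average quality could look like,
the Python below reads FASTA-style .qual text (one ">" header per read,
followed by space-separated integer Phred scores) and keeps the IDs of
reads whose mean score meets a threshold. It runs anywhere Python does,
so it does not require Linux. The function names and example data are
hypothetical, and the mean used here is a plain arithmetic mean of the
Phred scores, which is an assumption about what "average quality" means.

```python
def read_qual(lines):
    """Yield (read_id, scores) pairs from FASTA-style .qual lines."""
    read_id, scores = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if read_id is not None:
                yield read_id, scores
            # Keep only the first word of the header as the read ID.
            read_id, scores = line[1:].split()[0], []
        else:
            scores.extend(int(tok) for tok in line.split())
    if read_id is not None:
        yield read_id, scores


def mean_quality(scores):
    """Arithmetic mean of Phred scores (0.0 for an empty read)."""
    return sum(scores) / len(scores) if scores else 0.0


def ids_passing_mean(lines, min_mean=20):
    """Return IDs of reads whose mean quality is at least `min_mean`."""
    return [rid for rid, s in read_qual(lines) if mean_quality(s) >= min_mean]


# Made-up example data: read1 has mean 25, read2 has mean 14.25.
example = """>read1 length=4
30 30 10 30
>read2 length=4
10 12 15
20
"""
print(ids_passing_mean(example.splitlines()))  # ['read1']
```

Because this only selects whole reads by ID and never trims them, the
matching sequence records could be pulled out of the original FASTA file
unchanged, with primers and barcodes still attached.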
Thanks!
--
Jack Lighten,
Ph.D. Candidate,
Bentzen Lab,
Room 6078,
Department of Biology,
Dalhousie University,
Halifax, NS, B3H 4J1
Canada
Office:(902) 494-1398
Email: Jackie.Lighten@Dal.Ca
Profile: www.marinebiodiversity.ca/CHONe/Members/lightenj/profile/bio
modified 7.4 years ago by Jennifer Hillman Jackson • written 7.7 years ago by Jackie Lighten