Question: Which Input FASTQ Quality Scores Type Should I Choose When Running FASTQ Groomer?
Du, Jianguang wrote (5.3 years ago):
Hi All,

I downloaded some RNA-seq datasets from NCBI. The datasets were generated on an Illumina HiSeq 2000. I am not sure which "Input FASTQ quality scores type" I should choose when running FASTQ Groomer. Below are the sequences and quality scores of two reads from one dataset; I renamed them "read 1" and "read 2".

1) Sequence and quality scores as displayed in Galaxy:

@read 1 length=51
NTGAGATTCTTGACTAGTTATTTCTGCTTTCAGGGAAGAAATCAGCTGGGC
+read 1 length=51
#1=ADADEHHHHHIIGIHJGJJJHJIIJJJH@HEGBFH;FHEH>@HIJJJJ
@read 2 length=51
NGAAGAGTCAGTTTTTTGTTTCCCTCATAACTTGCTAGATTCCGGATTGCT
+read 2 length=51
#1=DDDEDHHFHHJJJJJIJJHIIIJJJIJJJJJJJIJIJJJJJJIJJJJI

2) Sequence and one-channel quality scores as shown in the NCBI SRA when I downloaded the dataset:

NTGAGATTCTTGACTAGTTATTTCTGCTTTCAGGGAAGAAATCAGCTGGGC
One channel quality score
2 16 28 32 35 32 35 36 39 39 39 39 39 40 40 38 40 39 41 38 41 41 41 39 41 40 40 41 41 41 39 31 39 36 38 33 37 39 26 37 39 36 39 29 31 39 40 41 41 41 41

NGAAGAGTCAGTTTTTTGTTTCCCTCATAACTTGCTAGATTCCGGATTGCT
One channel quality score
2 16 28 35 35 35 36 35 39 39 37 39 39 41 41 41 41 41 40 41 41 39 40 40 40 41 41 41 40 41 41 41 41 41 41 41 40 41 40 41 41 41 41 41 41 40 41 41 41 41 40

It looks like the dataset was generated with Illumina pipeline version 1.8 or later, because some of the bases have a quality score of 41. Can I choose "sanger" as the "Input FASTQ quality scores type" when I run FASTQ Groomer?

Thanks.
Jianguang Du
galaxy • 1.7k views
Jennifer Hillman Jackson (United States) wrote (5.3 years ago):
Hi Jianguang,

I agree: these are already Sanger (Phred+33 offset) quality scores, so the datatype you want is .fastqsanger (with near certainty). To double-check, run "FastQC" on a sample of the data, or on the entire dataset if you plan on doing quality checks anyway (potential trimming, etc.).

You also don't need to run the groomer. Just assign the datatype by clicking on the dataset's pencil icon. Help is at the link below, and the screencast "FASTQ Prep" walks through a how-to (using SRA data as an example):

http://wiki.galaxyproject.org/Support#Dataset_special_cases

Hope this helps, but you are really already on the right track, I'm just agreeing!

Jen
Galaxy

--
Jennifer Hillman-Jackson
http://galaxyproject.org
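
The Phred+33 conclusion can also be checked by hand. The sketch below is only an illustration, not part of Galaxy or FASTQ Groomer: it decodes the "read 1" quality string from the question with an offset of 33 and compares the result with the per-base scores SRA reported. The function names `decode_quality` and `looks_like_phred33` are made up for this example, and the second one is a rough heuristic that assumes the "+64" encodings never use characters below ';' (ASCII 59).

```python
# Quality string for "read 1" as displayed in Galaxy (copied from the question).
READ1_QUAL = "#1=ADADEHHHHHIIGIHJGJJJHJIIJJJH@HEGBFH;FHEH>@HIJJJJ"

# Per-base scores for the same read as shown by SRA (copied from the question).
READ1_SRA = [
    2, 16, 28, 32, 35, 32, 35, 36, 39, 39, 39, 39, 39, 40, 40, 38, 40,
    39, 41, 38, 41, 41, 41, 39, 41, 40, 40, 41, 41, 41, 39, 31, 39, 36,
    38, 33, 37, 39, 26, 37, 39, 36, 39, 29, 31, 39, 40, 41, 41, 41, 41,
]


def decode_quality(qual, offset=33):
    """Convert a FASTQ quality string to integer scores for a given ASCII offset."""
    return [ord(ch) - offset for ch in qual]


def looks_like_phred33(qual):
    """Heuristic (an assumption of this sketch): characters below ';' (ASCII 59)
    cannot occur in the +64 encodings, so their presence points to Phred+33."""
    return min(ord(ch) for ch in qual) < 59


if __name__ == "__main__":
    # True if decoding with offset 33 reproduces the SRA scores exactly.
    print(decode_quality(READ1_QUAL) == READ1_SRA)
    # True if the string contains characters only possible with offset 33.
    print(looks_like_phred33(READ1_QUAL))
```

For this read both checks print True: the leading '#' is ASCII 35, which decodes to 2 with offset 33 and would be a negative score with offset 64. FastQC remains the more thorough check on a full dataset.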