Question: Have Fastq ... But Don't Know Input Format For Groomer?
Asked 7.8 years ago by Johnson, Kory (NIH/NINDS) [C]
Hello,

My account login is: johnsonko@ninds.nih.gov

I am a first-time Galaxy user. I have uploaded my sequences into Galaxy as format "fastq" and would next like to use "Groomer" to output Sanger FASTQ format, so that I can go on to explore quality via box plot, decide on a trim length (if any), and map to the genome using bwa or bowtie.

However, I am running into a problem using "Groomer": I do not know which format my sequences are in, and I need that to set the required input parameter. An example of my sequences is as follows:

    @SNPSTER6_0679:1:1:1083:939#0/1 run=100908_SNPSTER6_0679_70929AAXX
    NATTTATGGATAGTTGGGTAGTAGGTGTAAATGTATGTGGTAAAAGGCCTAGGAGATTTGTTGATCCAAT
    AAATATGATTAGGGAAACAA
    +SNPSTER6_0679:1:1:1083:939#0/1
    BIQQIQQQTP[[[[[VVVVQPPPPPTWWWW[[YYTTTOVV____TWVXRWPTQPQWWWWWTOOVV___V_
    TROOWTWTWTQWQWTTRWRO
    ...

How can I tell whether I have "Sanger", "Solexa", "Illumina 1.3+", etc.? I have tried submitting to "Groomer" several times, using these options one at a time, and none of the runs returned results. Need help please.

Also, what is the expected time for "Groomer" to return results for a file containing 2.7 million reads?

Thank you ... best,
Kory

Kory R. Johnson, MS, PhD
Sr. Bioinformatics Scientist
Bioinformatics Section, Information Technology & Bioinformatics Program, Division of Intramural Research (DIR), National Institute of Neurological Disorders & Stroke (NINDS), National Institutes of Health (NIH), Bethesda, Maryland
bwa alignment bowtie • 1.0k views
Answered 7.8 years ago by Daniel Blankenberg (United States)
Hi Kory,

The problem with this FASTQ block is that the sequence and quality score identifier lines do not match ('SNPSTER6_0679:1:1:1083:939#0/1 run=100908_SNPSTER6_0679_70929AAXX' vs 'SNPSTER6_0679:1:1:1083:939#0/1'): the identifier on the sequence line has additional text that is not found on the identifier of the quality score line, which is not valid for the FASTQ format. Alternatively, the quality score identifier line could be only a '+', without the sequence identifier.

The quality score lines appear to be either Illumina or Solexa, but it is best to check with the source of the data to be sure:

    Input ASCII range: 'B'(66) - '_'(95)
    Input decimal range: 2 - 31

You'll need to upload valid FASTQ files in order to work with them in Galaxy. Correct examples of your provided read are:

    @SNPSTER6_0679:1:1:1083:939#0/1
    NATTTATGGATAGTTGGGTAGTAGGTGTAAATGTATGTGGTAAAAGGCCTAGGAGATTTGTTGATCCAAT
    AAATATGATTAGGGAAACAA
    +SNPSTER6_0679:1:1:1083:939#0/1
    BIQQIQQQTP[[[[[VVVVQPPPPPTWWWW[[YYTTTOVV____TWVXRWPTQPQWWWWWTOOVV___V_
    TROOWTWTWTQWQWTTRWRO

or

    @SNPSTER6_0679:1:1:1083:939#0/1 run=100908_SNPSTER6_0679_70929AAXX
    NATTTATGGATAGTTGGGTAGTAGGTGTAAATGTATGTGGTAAAAGGCCTAGGAGATTTGTTGATCCAAT
    AAATATGATTAGGGAAACAA
    +
    BIQQIQQQTP[[[[[VVVVQPPPPPTWWWW[[YYTTTOVV____TWVXRWPTQPQWWWWWTOOVV___V_
    TROOWTWTWTQWQWTTRWRO

or

    @SNPSTER6_0679:1:1:1083:939#0/1 run=100908_SNPSTER6_0679_70929AAXX
    NATTTATGGATAGTTGGGTAGTAGGTGTAAATGTATGTGGTAAAAGGCCTAGGAGATTTGTTGATCCAAT
    AAATATGATTAGGGAAACAA
    +SNPSTER6_0679:1:1:1083:939#0/1 run=100908_SNPSTER6_0679_70929AAXX
    BIQQIQQQTP[[[[[VVVVQPPPPPTWWWW[[YYTTTOVV____TWVXRWPTQPQWWWWWTOOVV___V_
    TROOWTWTWTQWQWTTRWRO

Please let us know if we can be of further assistance.

Thanks for using Galaxy,
Dan
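The two checks in this answer (a '+' line that matches the '@' line, and the ASCII range of the quality characters) can be sketched in a few lines of Python. This is only an illustration, not part of Galaxy or the Groomer tool. The function names `normalize_record`, `quality_range`, and `guess_offset` are hypothetical. The code also assumes that the wrapped sequence and quality lines in the example are continuation lines of one read, and that each record is passed in as its own list of lines.

```python
def normalize_record(lines):
    """Return one FASTQ record as four lines with a bare '+' separator.

    `lines` holds a single record: '@' header, one or more sequence
    lines, a '+' line, then one or more quality lines. Wrapped lines are
    joined (an assumption about how the example was wrapped).
    """
    header = lines[0]
    if not header.startswith("@"):
        raise ValueError("record must start with an '@' header line")
    # Sequence lines never start with '+', so the first '+' line after
    # the header is the separator, even if quality lines contain '+'.
    plus_index = next(
        i for i, line in enumerate(lines[1:], start=1) if line.startswith("+")
    )
    sequence = "".join(lines[1:plus_index])
    quality = "".join(lines[plus_index + 1:])
    if len(sequence) != len(quality):
        raise ValueError("sequence and quality lengths differ")
    # A bare '+' sidesteps any mismatch between the two identifier lines.
    return [header, sequence, "+", quality]


def quality_range(quality):
    """Return the lowest and highest quality characters in a string."""
    return min(quality), max(quality)


def guess_offset(lowest_char):
    """Heuristic guess at the ASCII offset from the lowest quality character.

    Characters below ';' (59) cannot occur in Solexa or Illumina 1.3+
    data, so they point to offset 33 (Sanger). Anything higher is
    treated as offset 64, which is only a guess: confirm it with the
    source of the data.
    """
    return 33 if ord(lowest_char) < 59 else 64
```

For the read in this thread, the lowest and highest quality characters are 'B' (66) and '_' (95). With an offset of 64 that gives scores of 2 to 31, matching the range quoted above. The heuristic cannot tell Solexa from Illumina 1.3+, so the choice passed to Groomer still needs to come from whoever produced the data.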