Question: Fasta dataset contains non-ATCGN characters and fails with FastQC
17 days ago by
jay wrote:


I am getting this error on job submission. What follows is only a subset of the full error. Not all files give me this error. How can I resolve it?

File "/opt/galaxy-user/galaxy-17.09-github/.venv/lib/python2.7/site-packages/MySQLdb/", line 208, in unicode_literal
    return db.literal(u.encode(unicode_literal.charset))
UnicodeEncodeError: 'latin-1' codec can't encode character u'\u2018' in position 117: ordinal not in range(256)


Are you using quotation marks in those file names?
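For context, u'\u2018' in the traceback is a left single quotation mark (a "curly" quote), which the latin-1 codec cannot encode because latin-1 only covers code points 0 to 255. A minimal Python sketch for spotting such characters in any piece of text (a file name, a dataset name, a sequence header) could look like this. The function name find_non_latin1 and the sample string are hypothetical, chosen only for illustration; this is not part of Galaxy or MySQLdb.

```python
# -*- coding: utf-8 -*-
import unicodedata


def find_non_latin1(text):
    """Return (index, character, unicode_name) for each character that
    latin-1 cannot encode, i.e. whose code point is above 255."""
    hits = []
    for index, character in enumerate(text):
        if ord(character) > 255:
            name = unicodedata.name(character, "UNKNOWN")
            hits.append((index, character, name))
    return hits


# Hypothetical example: a name containing curly quotes.
sample = u"my\u2018sample\u2019.fasta"
for index, character, name in find_non_latin1(sample):
    print("position %d: U+%04X %s" % (index, ord(character), name))
```

Curly quotes often come from text pasted out of a word processor, so they can be present even when no plain ' or " was typed.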

modified 16 days ago • written 16 days ago by skhan

Hi, the filename doesn't contain quotation marks.

written 15 days ago by jay
15 days ago by
United States
Jennifer Hillman Jackson wrote:


This looks like an input formatting problem.

  1. Make sure the input is nucleotide sequence and not protein (amino acids).
  2. Check the fasta content for characters other than A, T, C, G, and N. IUPAC characters will cause problems.
  3. Ensure the assigned datatype is correct. Note that fastqsanger, fastq, fasta, fastqsanger.gz, fastq.gz, and fasta.gz are all distinct datatypes, and the assigned datatype must match the input's compression (if used).
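
The check in step 2 can be sketched in Python. This is a minimal sketch rather than a Galaxy or FastQC tool: the function name find_non_atcgn is hypothetical, and treating lowercase bases as valid is an assumption that may not match what a given tool accepts.

```python
def find_non_atcgn(lines, allowed="ATCGN"):
    """Return (line_number, column, character) for every sequence
    character that is not in `allowed`. Header lines (starting with
    '>') and blank lines are skipped."""
    allowed_set = set(allowed.upper())
    hits = []
    for line_number, line in enumerate(lines, start=1):
        sequence = line.rstrip("\r\n")
        if not sequence or sequence.startswith(">"):
            continue  # FASTA header or blank line
        for column, character in enumerate(sequence, start=1):
            # Assumption: lowercase a/t/c/g/n count as valid.
            if character.upper() not in allowed_set:
                hits.append((line_number, column, character))
    return hits


# Small in-memory example; R is an IUPAC ambiguity code.
example = [">seq1\n", "ATCGN\n", "ATRG\n"]
for line_number, column, character in find_non_atcgn(example):
    print("line %d, column %d: %r" % (line_number, column, character))
```

To scan a real file, pass an open file handle instead of the list, for example find_non_atcgn(open("input.fasta")), where input.fasta stands in for your own dataset.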

Support FAQs:

Thanks! Jen, Galaxy team

modified 23 hours ago • written 15 days ago by Jennifer Hillman Jackson

Thanks, Jennifer. The file contains invalid characters.

written 1 day ago by jay

