Question: Fasta dataset contains non-ATCGN characters and fails with FastQC
5 months ago, jay wrote:

Hi,

I am getting this error on job submission. This is only part of the full error message. Not all files give me this error. How can I resolve it?

File "/opt/galaxy-user/galaxy-17.09-github/.venv/lib/python2.7/site-packages/MySQLdb/connections.py", line 208, in unicode_literal
    return db.literal(u.encode(unicode_literal.charset))
UnicodeEncodeError: 'latin-1' codec can't encode character u'\u2018' in position 117: ordinal not in range(256)


Are you using quotation marks in those file names?

written 5 months ago by skhan40

Hi, the filename doesn't have quotation marks.

written 5 months ago by jay

5 months ago, Jennifer Hillman Jackson (United States) wrote:

Hello,

This looks like an input formatting problem.

  1. Make sure the input is nucleotide content and not protein (amino acids).
  2. Check the FASTA content for characters other than A, T, C, G, and N. IUPAC ambiguity codes (for example R, Y, or K) will cause problems.
  3. Ensure the assigned datatype is correct. Note that fastqsanger, fastq, fasta, fastqsanger.gz, fastq.gz, and fasta.gz are all distinct datatypes, and the assigned datatype must match the input's actual format and compression (if used).
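
For check 2, a short script can locate the offending characters before the file is uploaded. The following is a minimal Python 3 sketch, not a Galaxy or FastQC tool. The function name `find_invalid_characters` is hypothetical. Treating any non-ASCII character in a header line as suspicious is an assumption, made because the traceback above complains about u'\u2018' (a curly left quotation mark) rather than about a nucleotide code.

```python
import io

# Characters accepted in sequence lines (compared case-insensitively).
ALLOWED = set("ACGTN")


def find_invalid_characters(handle, allowed=ALLOWED):
    """Yield (line_number, column, character) for each suspicious character.

    Sequence lines: any character outside `allowed` is reported.
    Header lines (starting with '>'): only non-ASCII characters are
    reported (an assumption, since header text is otherwise free-form).
    """
    for line_number, line in enumerate(handle, start=1):
        line = line.rstrip("\r\n")
        is_header = line.startswith(">")
        for column, char in enumerate(line, start=1):
            if is_header:
                suspicious = ord(char) > 127
            else:
                suspicious = char.upper() not in allowed
            if suspicious:
                yield line_number, column, char


if __name__ == "__main__":
    # Small in-memory example; the sequences are made up for illustration.
    sample = io.StringIO(">seq1\nACGTN\nACG\u2018T\n>seq2\nACGRYT\n")
    for line_number, column, char in find_invalid_characters(sample):
        print(f"line {line_number}, column {column}: {char!r} (U+{ord(char):04X})")
```

To scan a real file, pass an open text handle instead of the sample, for example `open(path, encoding="utf-8")`, or `gzip.open(path, "rt", encoding="utf-8")` for a compressed file, where `path` is your own file. Note that whitespace inside a sequence line is also reported.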

Support FAQs: https://galaxyproject.org/support/#troubleshooting

Thanks! Jen, Galaxy team


Thanks Jennifer, the file contains invalid characters.

written 5 months ago by jay