I am trying to classify a fastq read file with MetaPhlAn2. When I run MetaPhlAn2, the result is 100% unclassified. When I convert the fastq to fasta and run MetaPhlAn instead, I get the expected classification (100% Cyanobacteria). I assume there is some problem with the fastq file, but I do not know what it is, since the file appears to be valid and works for the assembly and Bowtie mapping that I did externally. Any help will be appreciated. Thanks!
Hello,
Double check that your fastq data is in fastqsanger format. This is the required fastq input format/content for Galaxy-wrapped tools. During Upload, this datatype is automatically assigned when "autodetect" is used and the data is a match for the quality score scaling.
If you are loading the data compressed and preserving that by directly assigning the compressed variation of the same datatype (fastqsanger.gz), it may not really be in fastqsanger format. You'll need to confirm that (run FastQC).
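If you want a quick check outside of Galaxy in addition to FastQC, the quality-score scaling can be estimated from the ASCII range of the quality lines. The following is only a rough sketch, not how Galaxy or FastQC detect the encoding: the function names are made up for illustration, it assumes unwrapped 4-line fastq records, and the cutoffs (59 and 74) are the commonly used heuristic for telling Phred+33 from Phred+64.

```python
import gzip
import itertools


def open_fastq(path):
    """Open a plain or gzip-compressed fastq file as text."""
    return gzip.open(path, "rt") if path.endswith(".gz") else open(path, "rt")


def quality_range(lines, max_reads=10000):
    """Return (lowest, highest) ASCII code seen in the quality lines.

    Assumes 4-line records, so the quality string is every 4th line
    starting at index 3. Only the first max_reads records are scanned.
    """
    lo, hi = 255, 0
    for qual in itertools.islice(lines, 3, max_reads * 4, 4):
        codes = [ord(c) for c in qual.rstrip("\r\n")]
        if codes:
            lo = min(lo, min(codes))
            hi = max(hi, max(codes))
    return lo, hi


def guess_encoding(lo, hi):
    """Heuristic guess of the quality encoding from the ASCII range."""
    if lo < 59:
        # Characters below ';' cannot occur in Phred+64 or Solexa+64 data.
        return "fastqsanger (Phred+33)"
    if hi > 74:
        # Characters above 'J' are unusual for Phred+33 Illumina data.
        return "likely Phred+64 (older Illumina/Solexa)"
    return "ambiguous"
```

Usage would look like `with open_fastq("reads.fastq.gz") as fh: print(guess_encoding(*quality_range(fh)))`, where the file name is a placeholder. Anything other than the Phred+33 answer suggests the datatype assignment should be revisited.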
- Galaxy FAQs: https://galaxyproject.org/support >> https://galaxyproject.org/support/#getting-inputs-right
- Galaxy Tutorials: https://galaxyproject.org/learn/ >> https://galaxyproject.org/tutorials/ngs/
Thanks! Jen, Galaxy team
Dear Jen
Thank you for your response. When I run FastQC on my data, the format is reported as "Sanger/Illumina 1.9". I still get no results, whether I load the file specifying the "fastqsanger" format or let it be autodetected.
Best,
Javier
If the reads are in fastqsanger, and FastQC doesn't show any data problems (low quality scores), then the next items to review are the parameters between the two runs.
These are different tool wrappers, and different versions of the underlying tool, but you should be able to map many parameters from one to the other. The default settings for one version of the tool may not be the default for the other version. Test out a few runs with different settings and see what results.
Most if not all options are the same as when the tool is used on the command line, so the tool manuals can be reviewed if you are not sure what each parameter does. If any settings filter results by quality score, length, etc., make sure your data passes those thresholds or adjust them.
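To see whether your reads would survive a length or quality filter before changing tool settings, you could count how many pass a given pair of thresholds. This is a minimal sketch under stated assumptions: the function name and the default thresholds (70 bases, mean quality 20) are placeholders and not the defaults of either tool wrapper, the records are assumed to be unwrapped 4-line fastq, and the quality offset is assumed to be Phred+33.

```python
import itertools


def count_passing(lines, min_len=70, min_mean_q=20, offset=33):
    """Return (passed, total) reads meeting both thresholds.

    min_len and min_mean_q are illustrative values only; substitute the
    thresholds actually set on the tool form you are testing.
    """
    total = passed = 0
    it = iter(lines)
    while True:
        record = list(itertools.islice(it, 4))  # header, sequence, '+', quality
        if len(record) < 4:
            break
        seq = record[1].strip()
        qual = record[3].strip()
        total += 1
        mean_q = sum(ord(c) - offset for c in qual) / len(qual) if qual else 0
        if len(seq) >= min_len and mean_q >= min_mean_q:
            passed += 1
    return passed, total
```

If only a small fraction of reads passes, the filter settings (rather than the fastq format) would be the first thing to adjust between the two runs.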
If you still can't get hits, the next thing to troubleshoot is a possible tool install/index problem. You could report the "no hit" results to the administrators of the server you are working on. Send either a bug report (if active there) or a direct email; check the homepage of the server for contact information.