This is my first RNA-seq data analysis project, and I am using Galaxy to analyze differential expression across 4 treatments. So far I have successfully trimmed my raw data with Trimmomatic and aligned my reads with HISAT2. Now I am trying to use htseq-count. My first few runs succeeded, but some files are now failing with an error message. I have re-run these several times and they still fail.
Should I abandon htseq-count and use another counting tool, or can I discard the failed files and continue my research with the data that did not fail?
What error message are you getting and from which tool and version (htseq_count)? Are you working at http://usegalaxy.org?
Post the error itself back as a comment with the format as "Code Sample" or share a link to an external site where formatting is preserved, such as a Gist.
The problem is likely a data format/content issue with the fastq input, or htseq_count usage, but let's review the error and decide if that is where to look for the problem. Discarding data would probably be a mistake unless it is totally unusable and it doesn't impact your primary experimental goals.
Thanks! Jen, Galaxy team
I am using htseq-count (Galaxy Version 0.6.1galaxy3) at http://usegalaxy.org.
Code Sample:
Fatal error: Unknown error occured
83877 GFF lines processed.
Error occured when reading beginning of SAM/BAM file.
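Since the message points at the start of the SAM/BAM file, one thing that may be worth ruling out is whether the failing inputs are empty, truncated, or not actually BAM/SAM. The following is only a rough sketch, assuming the failing datasets have been downloaded from Galaxy to a local machine; the function name `sniff_alignment_format` is made up for illustration and is not part of htseq-count or Galaxy. It relies on two facts about the formats: a BAM file is gzip-compatible (BGZF) and its decompressed content starts with the bytes `BAM\1`, and SAM header lines start with `@`.

```python
import gzip
import sys

# The decompressed content of a BAM file starts with these four bytes.
BAM_MAGIC = b"BAM\x01"
# Every gzip (and therefore BGZF) file starts with these two bytes.
GZIP_MAGIC = b"\x1f\x8b"


def sniff_alignment_format(path):
    """Hypothetical helper: guess 'bam', 'sam', 'empty', or 'unknown'.

    Only the first few bytes are inspected, so this does not validate
    the whole file. A SAM file without a header is reported as 'unknown'.
    """
    with open(path, "rb") as handle:
        head = handle.read(2)
    if not head:
        return "empty"
    if head == GZIP_MAGIC:
        try:
            with gzip.open(path, "rb") as handle:
                magic = handle.read(4)
        except (OSError, EOFError):
            # Corrupt or truncated compressed data.
            return "unknown"
        return "bam" if magic == BAM_MAGIC else "unknown"
    if head[:1] == b"@":
        return "sam"
    return "unknown"


if __name__ == "__main__":
    # Check every file path given on the command line.
    for file_path in sys.argv[1:]:
        print(file_path, sniff_alignment_format(file_path))
```

If a failing file comes back as "empty" or "unknown" while the files that worked come back as "bam", that would suggest looking at the upstream HISAT2 job for that sample rather than at htseq-count itself. A "bam" result only means the first bytes look right, not that the rest of the file is intact.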