Question: htseq-count outputting 3 files
7 months ago by
encarnaciona20
encarnaciona20 wrote:

Hello, I am running htseq-count on some HISAT2 output files. This time through, I'm noticing that htseq-count is producing 3 outputs: the main counts output, a (no feature) output, and a (BAM) output. I didn't get the (BAM) output the first time I ran samples through on this server. I am also noticing that a metadata error is reported on the (BAM) output.

How do I resolve this issue?

Thanks!

rna-seq bam file htseq • 377 views
7 months ago by
United States
Jennifer Hillman Jackson25k wrote:

Hello,

The tool version showing this problem is an older one. That older version now outputs the extra BAM file even when it is not requested (this is a bug) -- the updated tool version handles this setting/output correctly.

Best tool version to use: htseq-count - Count aligned reads in a BAM file that overlap features in a GFF file (Galaxy Version 0.6.1galaxy3). The extra BAM output is toggled with the parameter option "Additional BAM Output". If you choose to output this, a reference dataset is required -- either a built-in "Locally cached" genome or one from the History (your own custom genome fasta).

If a BAM dataset has metadata problems (from any tool), it usually means either that the tool produced no output or that there was an input format/content problem. Here it is part of the bug with the earlier tool version -- the BAM is output even when not requested and with no reference genome selected, resulting in an empty result that cannot be indexed (leading to the metadata warning).
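If you want to confirm that the extra BAM really is empty, you can download it and check it locally. The following is only a minimal sketch, not part of the Galaxy tool: the function name `classify_bam` and the file name in the usage note are made up for illustration, and it assumes the dataset has been saved to disk. It relies on two general facts about the format: a BAM file is BGZF (gzip-compatible) compressed, and the decompressed stream starts with the bytes `BAM\1`.

```python
import gzip
import zlib
from pathlib import Path

# First four bytes of a decompressed BAM stream.
BAM_MAGIC = b"BAM\x01"


def classify_bam(path):
    """Return 'empty', 'not_bam', or 'looks_like_bam' for a local file.

    Hypothetical helper for illustration only; it checks the header
    bytes and does not validate the rest of the file.
    """
    p = Path(path)
    # A zero-byte file has no header at all, so it cannot be indexed.
    if p.stat().st_size == 0:
        return "empty"
    try:
        # BGZF is readable as ordinary gzip, so the stdlib is enough here.
        with gzip.open(p, "rb") as handle:
            magic = handle.read(4)
    except (OSError, EOFError, zlib.error):
        # Not gzip-compressed, truncated, or corrupt.
        return "not_bam"
    return "looks_like_bam" if magic == BAM_MAGIC else "not_bam"
```

For example, `classify_bam("htseq_extra_output.bam")` (a hypothetical file name) returning `"empty"` would match the situation described above, where the dataset has no content.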

Choices:

  • Use the latest tool. This is the best solution. It includes bug fixes/enhancements and a restructured tool form.
  • Use the older tool and either ignore or purge (permanently delete) the empty BAM output. It has no content, so it does not count toward account quota usage.
  • Use the older tool and set the option to output the BAM correctly (avoids the BAM metadata problem). The option is under "Set advanced options >> Set advanced options >> Additional BAM Output". Choose to output the BAM and the form will refresh allowing the selection of a reference genome.

Thanks, Jen, Galaxy team
