Question: suite_samtools_1_2 EOF marker absent error
rich_jp (United Kingdom) wrote, 3.4 years ago:

Hi all, I'm having a problem running SAM-to-BAM with samtools 1.2 from a toolshed install of suite_samtools_1_2. My pipeline runs 'SAM-to-BAM' after mapping, followed by GATK2 'realign target creator/indel realign' and then Picard 'mark duplicate reads'. When I run this last Picard step I get the "EOF marker is absent" error. The problem seems to occur only when I run the pipeline with a built-in genome build that I created myself (I created all the loc files and index files).

I am running the 15_05 stable release from git. Is it possible that I have malformed index files for this genome, or does this have something to do with the way Galaxy now indexes the BAM file?

Any help/suggestions greatly appreciated as I have been stuck with this issue for a few days now.

Cheers, 

Richard

[bam_header_read] EOF marker is absent. The input is probably truncated.
[bam_header_read] invalid BAM binary header (this is not a BAM file).
[bam_index_core] Invalid BAM header.
[bam_index_build2] fail to index the BAM file.
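
For anyone hitting the same message: the "EOF marker" is the fixed 28-byte empty BGZF block that the SAM/BAM specification places at the end of every complete BAM file. A minimal sketch for checking whether a file ends with it, using only the Python standard library (the function name `has_bgzf_eof` is made up for this illustration and is not part of samtools or Galaxy):

```python
import os

# The 28-byte empty BGZF block that the SAM/BAM specification defines
# as the end-of-file marker for BAM files.
BGZF_EOF = bytes.fromhex(
    "1f 8b 08 04 00 00 00 00 00 ff 06 00 42 43 02 00 "
    "1b 00 03 00 00 00 00 00 00 00 00 00"
)


def has_bgzf_eof(path):
    """Return True if the file at `path` ends with the BGZF EOF block."""
    if os.path.getsize(path) < len(BGZF_EOF):
        return False
    with open(path, "rb") as fh:
        # Jump to the last 28 bytes and compare them with the marker.
        fh.seek(-len(BGZF_EOF), os.SEEK_END)
        return fh.read() == BGZF_EOF
```

A `False` result only says that the marker is missing, for example because the file was truncated or is not BGZF-compressed at all. It does not say which pipeline step caused that.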

 

 

Jennifer Hillman Jackson (United States) wrote, 3.4 years ago:

Hello,

Try using the tool SortSAM to coordinate-sort the BAM file before running Mark Duplicates. Most Picard/SAMTools functions now require coordinate-sorted input. Uploaded BAM datasets are sorted automatically, but intermediate steps may create BAM files that are not.

Thanks, and please let us know if this does not resolve the issue. Jen, Galaxy team
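
One way to see what sort order an intermediate BAM file declares is to read the `SO:` tag on the `@HD` line of its header. The following is a rough sketch using only the Python standard library; it relies on BGZF being readable as ordinary multi-member gzip, and the function name `bam_sort_order` is hypothetical. Note that it reports what the header claims, which is not necessarily how the records are actually ordered.

```python
import gzip
import struct


def bam_sort_order(path):
    """Return the SO: value from the @HD header line of a BAM file, or None."""
    with gzip.open(path, "rb") as fh:
        # A BAM file starts with the magic bytes "BAM\1".
        if fh.read(4) != b"BAM\x01":
            raise ValueError("not a BAM file")
        # Next is the length of the plain-text SAM header (little-endian int32).
        (l_text,) = struct.unpack("<i", fh.read(4))
        text = fh.read(l_text).decode("ascii", errors="replace")
    for line in text.rstrip("\x00").splitlines():
        if line.startswith("@HD"):
            for field in line.split("\t")[1:]:
                if field.startswith("SO:"):
                    return field[3:]
    return None
```

A return value of `coordinate` is what Mark Duplicates expects here; `unsorted`, `queryname`, or `None` suggests adding the sort step first.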

rich_jp (United Kingdom) wrote, 3.4 years ago:

Hiya, thanks for this. I will try this out and let you know how it goes. What I find strange is that with some genomes the intermediate BAM files do not throw this error (suggesting they may have been sorted) but with others they do, even though the pipeline is always the same. I had also noticed that when running SAM-to-BAM I often see this message (or something similar) in the out/peek window: [bam_sort_core] merging from 27 files... so I had assumed they were being sorted.

In any case, I will add SortSAM to the pipeline and see if it resolves the issue :)

Jennifer Hillman Jackson replied, 3.4 years ago:

If you convert BAM->SAM, you will find that some tools do sort BAMs on output (the records are not only sorted, the header also carries the correct "sorted" flag). Other intermediate tools can alter this, however. Sorting is the best option if you are not sure or if a problem comes up. When you build this into a workflow, the intermediate sorted files or the original output can be hidden (and also permanently deleted) to manage data usage in your account. Thanks, Jen
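
Since a header flag and the actual record order can disagree after an intermediate step, it can help to check the records themselves once the BAM has been converted to SAM. A small sketch, assuming the usual coordinate order (reference sequences in `@SQ` header order, then position, with reads that have no reference last); `sam_is_coordinate_sorted` is an illustrative name, not an existing tool:

```python
def sam_is_coordinate_sorted(lines):
    """Return True if the SAM records in `lines` are in coordinate order."""
    ref_rank = {}      # reference name -> position in the @SQ header
    last_key = None
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue
        if line.startswith("@"):
            # Header line: remember the order of the reference sequences.
            if line.startswith("@SQ"):
                for field in line.split("\t")[1:]:
                    if field.startswith("SN:"):
                        ref_rank[field[3:]] = len(ref_rank)
            continue
        cols = line.split("\t")
        rname, pos = cols[2], int(cols[3])
        # Reads with no known reference ("*") are placed after all others.
        key = (ref_rank.get(rname, float("inf")), pos)
        if last_key is not None and key < last_key:
            return False
        last_key = key
    return True
```

For a file on disk this could be called as `sam_is_coordinate_sorted(open("converted.sam"))`, where `converted.sam` stands for whatever the BAM->SAM step produced.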