Question: "Tool error: Unable to finish job" on the Docker version of Galaxy, but not for the usual reason (missing dependencies), and not consistently
Thon deBoer wrote, 6 months ago:

I created a wrapper for the NEAT-genReads NGS simulation tool, but I am running into the dreaded "Tool error: Unable to finish job". It happens only on E. coli-size genomes, only on the Docker version of Galaxy, and NOT because of missing dependencies (which is the usual reason for this error, as I see in other posts on this subject).

I AM able to run the tool successfully on a small genome (chrMT of hg19), so there are no missing dependencies. I don't see ANYTHING in the logs (I set -e "GALAXY_LOGGING=full" when starting the Docker image), nor is there anything in the stdout, the stderr, or the error log of the tool itself...

I am also able to run the tool successfully on the Planemo Virtual Appliance that I use for developing and testing the wrapper.

Here's how it looks on the Docker version: [screenshot not preserved]

And here's how it looks on the Planemo version (same E. coli genome file): [screenshot not preserved]

The likely culprit is the BAM file: if I choose NOT to create the BAM file, the job DOES finish correctly on the Docker version. I double-checked that samtools is installed on the Docker image (it is), so that can't be it. I am stumped and hope that someone can help me figure out why the Docker version is not working while the Planemo VM works fine...

No, it's not disk space on /tmp either...
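
For anyone who wants to repeat the two environment checks mentioned above (is samtools on the PATH, and is there free space in the temp directory), here is a minimal sketch using only the Python standard library. The function name `environment_report` is made up for this example, and it would need to be run inside the container to say anything about the Docker setup.

```python
import shutil
import tempfile


def environment_report(tools=("samtools",), tmp_dir=None):
    """Report where each tool is found on PATH and how much temp space is free.

    `environment_report` is a hypothetical helper, not part of Galaxy or Planemo.
    """
    tmp_dir = tmp_dir or tempfile.gettempdir()
    usage = shutil.disk_usage(tmp_dir)
    return {
        # shutil.which returns the full path, or None if the tool is not on PATH
        "tools": {name: shutil.which(name) for name in tools},
        "tmp_dir": tmp_dir,
        "tmp_free_gib": usage.free / 2**30,
    }


if __name__ == "__main__":
    print(environment_report())
```

A `None` entry under `"tools"` would point to a missing dependency, which is not what happened here.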

Thanks!

Tags: planemo, software error, docker

Hello, I've asked the developers to help. Meanwhile, could you comment back with the version/source of Planemo and Docker that you are comparing? Thanks! Jen, Galaxy team

(comment written 6 months ago by Jennifer Hillman Jackson)

I'm not sure how to check the version numbers of the Galaxy I am using on Docker and on Planemo. I looked in the Planemo build and see a file named:

release_17.05_receipt

So I'm guessing the Planemo version is 17.05.

I could not find anything similar for the Docker version, but the TOC in the git repository goes up to 17.09:

https://github.com/bgruening/docker-galaxy-stable

So neither of them is using 18, I think.

(reply written 6 months ago by Thon deBoer)
jmchilton (United States) wrote, 6 months ago:

Are the standard output and standard error for the datasets the same between the Docker and Planemo runs? It could be a difference in the version of Galaxy: if you are using a newer (say 18.01) Docker container, it is going to handle BAM files differently than, for instance, a Galaxy running 17.09, which would be my guess for Planemo.

I'm not sure I understand BAM sorting, but here is a blurb from the 18.01 release notes:

Previously Galaxy only supported coordinate sorted BAM files by default (the bam datatype). In addition, this release of Galaxy now supports three new types of BAM:

qname_sorted.bam, that ensures that the file is queryname sorted (e.g. SO:queryname);
qname_input_sorted.bam, that can be used to describe the output of aligners which generally keep mate pairs adjacent
unsorted.bam, that makes no assumptions about the sort order of the file.
A huge thanks goes out to @bgruening and @mvdbeek who implemented these datatypes.

Is it possible you are producing files that should be considered to have a different ordering in Galaxy 18.01 or newer?
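
One way to see which of these datatypes a file would match is to read the SO field of the @HD line in its header (for a BAM file the header text can be printed with `samtools view -H`). The sketch below is only an illustration: the mapping from sort order to Galaxy datatype name is an assumption based on the release-note blurb above, not Galaxy's actual detection logic, and both function names are made up.

```python
def sort_order_from_header(header_text):
    """Return the SO value from the @HD header line, or 'unknown' if absent."""
    for line in header_text.splitlines():
        if line.startswith("@HD"):
            for field in line.split("\t")[1:]:
                if field.startswith("SO:"):
                    return field[3:]
    return "unknown"


# Assumed mapping, inferred from the 18.01 release notes quoted above.
GALAXY_BAM_TYPE = {
    "coordinate": "bam",
    "queryname": "qname_sorted.bam",
}


def suggested_galaxy_datatype(header_text):
    """Pick a datatype name for the header; fall back to 'unsorted.bam'."""
    order = sort_order_from_header(header_text)
    return GALAXY_BAM_TYPE.get(order, "unsorted.bam")


if __name__ == "__main__":
    example_header = "@HD\tVN:1.6\tSO:unsorted\n@SQ\tSN:chr\tLN:4641652"
    print(suggested_galaxy_datatype(example_header))
```

Note that the header only records what the producing tool claims; a file whose header says SO:coordinate can still contain records out of order.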


I updated to the latest Docker version (I can't really see which version that is, but I'm guessing 18.something, since it had "unsorted.bam" as an option).

I changed the BAM file format type from "bam" to "unsorted.bam", and this seems to fix the error!

Next I made samtools sort the BAM files to ensure they are really sorted, BUT that did not solve the issue.

So it has something to do with the datatype of the BAM file. While I don't understand why samtools sort does not help, I think I am going in the right direction... More later.

(reply written and modified 6 months ago by Thon deBoer)
Thon deBoer wrote, 5 months ago:

It seems that the neat_genReads tool has a known issue with BAM files not being properly formatted...

I get this when trying to read the BAM file:

[E::bam_read1] CIGAR and query sequence lengths differ for ecoli-chr-268539/1
[main_samview] truncated file.
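
The first message means that, for that read, the number of bases implied by the CIGAR string does not match the length of the stored sequence. As a rough sketch of what that check amounts to, the code below applies the same comparison to SAM-format text lines: it sums the CIGAR operations that consume query bases (M, I, S, = and X, per the SAM specification) and compares the total with the length of SEQ. The function names are made up for this example, and this is not how samtools itself is implemented.

```python
import re

CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")
QUERY_CONSUMING = set("MIS=X")  # operations that consume bases of the read


def cigar_query_length(cigar):
    """Return the read length implied by a CIGAR string, or None for '*'."""
    if cigar == "*":
        return None
    return sum(int(n) for n, op in CIGAR_OP.findall(cigar) if op in QUERY_CONSUMING)


def find_length_mismatches(sam_lines):
    """Yield (read_name, cigar_length, seq_length) for inconsistent records."""
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):
            continue  # skip blank lines and header lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:
            continue  # not a complete alignment record
        name, cigar, seq = fields[0], fields[5], fields[9]
        expected = cigar_query_length(cigar)
        if expected is None or seq == "*":
            continue  # nothing to compare
        if expected != len(seq):
            yield name, expected, len(seq)


if __name__ == "__main__":
    # Made-up records for illustration only; the second one is inconsistent.
    records = [
        "read-ok\t0\tchr\t1\t60\t4M\t*\t0\t0\tACGT\tIIII",
        "read-bad\t0\tchr\t1\t60\t5M\t*\t0\t0\tACGT\tIIII",
    ]
    for name, cigar_len, seq_len in find_length_mismatches(records):
        print(f"{name}: CIGAR implies {cigar_len} bases, SEQ has {seq_len}")
```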

So this is NOT a problem with Galaxy. I think the reason the Planemo version shows no issues is that this is a random failure in the neat_genReads tool, and it just happened to occur on the Docker version.

This is a link to the issue on GitHub for neat_genReads.
