Question: JBrowse tool crashes on VCF files w/ "No such file or directory" -> dependency issue?
15 months ago, Merope wrote:

Hello,

I'm using the JBrowse tool maintained by iuc, but it crashes whenever I try to visualize VCF files. It looks a lot like a missing dependency (much like here, here, here and here), but all the declared dependencies are satisfied. The output also mentions FreeBayes, which is not in the dependency list. Could the dependency list be incomplete? Has anybody else run into this problem when visualizing VCF files?

  • jbrowse revision 4e11a688a635
  • all tool dependencies properly installed by conda (i.e. jbrowse, python, biopython, bcbiogff, samtools, pyyaml)
  • Galaxy docker image bgruening/galaxy-stable e86045c4050f
  • Galaxy version 17.05

Tool's full stderr output (stdout is empty):

Fatal error: Exit code 1 ()
INFO:jbrowse:Processing Default / FreeBayes on data 3 and data 39 (variants)
Traceback (most recent call last):
  File "/shed_tools/toolshed.g2.bx.psu.edu/repos/iuc/jbrowse/4e11a688a635/jbrowse/jbrowse.py", line 725, in <module>
    for key in keys:
  File "/shed_tools/toolshed.g2.bx.psu.edu/repos/iuc/jbrowse/4e11a688a635/jbrowse/jbrowse.py", line 616, in process_annotations
    self.add_vcf(dataset_path, outputTrackConfig)
  File "/shed_tools/toolshed.g2.bx.psu.edu/repos/iuc/jbrowse/4e11a688a635/jbrowse/jbrowse.py", line 497, in add_vcf
    self.subprocess_check_call(cmd)
  File "/shed_tools/toolshed.g2.bx.psu.edu/repos/iuc/jbrowse/4e11a688a635/jbrowse/jbrowse.py", line 335, in subprocess_check_call
    subprocess.check_call(command, cwd=self.outdir)
  File "/galaxy-central/database/dependencies/_conda/envs/mulled-v1-4f60687598c40f0675c7de956d89c6ebcb6abb8d1faef576907dbea7330e0778/lib/python2.7/subprocess.py", line 181, in check_call
    retcode = call(*popenargs, **kwargs)
  File "/galaxy-central/database/dependencies/_conda/envs/mulled-v1-4f60687598c40f0675c7de956d89c6ebcb6abb8d1faef576907dbea7330e0778/lib/python2.7/subprocess.py", line 168, in call
    return Popen(*popenargs, **kwargs).wait()
  File "/galaxy-central/database/dependencies/_conda/envs/mulled-v1-4f60687598c40f0675c7de956d89c6ebcb6abb8d1faef576907dbea7330e0778/lib/python2.7/subprocess.py", line 390, in __init__
    errread, errwrite)
  File "/galaxy-central/database/dependencies/_conda/envs/mulled-v1-4f60687598c40f0675c7de956d89c6ebcb6abb8d1faef576907dbea7330e0778/lib/python2.7/subprocess.py", line 1024, in _execute_child
    raise child_exception
OSError: [Errno 2] No such file or directory

Since I don't see what it tries to find exactly (or what "it" actually is), I've also tried to strace the job's processes to look for a suspicious syscall to no avail.

Any idea about where to look would help.

Thanks!
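
The traceback above ends in a bare OSError that does not name the program that could not be started. As a minimal sketch (not the tool's actual code), a wrapper along these lines would surface the missing name. The function name `check_call_verbose` is hypothetical, and the sketch is written for Python 3 although the traceback comes from Python 2.7:

```python
import errno
import subprocess


def check_call_verbose(command, cwd=None):
    """Run command like subprocess.check_call, but say what could not be started."""
    try:
        return subprocess.check_call(command, cwd=cwd)
    except OSError as exc:
        if exc.errno == errno.ENOENT:
            # ENOENT can mean a missing executable or a missing working directory
            raise RuntimeError(
                "Could not start %r (cwd=%r): executable or working directory not found"
                % (command[0], cwd)
            ) from exc
        raise
```

With this in place, the failure would read something like "Could not start 'bgzip' ..." instead of a bare "[Errno 2] No such file or directory".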

Tags: software error, jbrowse, vcf
15 months ago, Merope wrote:

Ok, a keener eye than mine managed to spot something in the strace output:

[...]
    execve("/galaxy-central/database/dependencies/_conda/envs/mulled-v1-4f60687598c40f0675c7de956d89c6ebcb6abb8d1faef576907dbea7330e0778/bin/bgzip", ["bgzip", "data/raw/1670cbf6e5390abdaeb29b7"...], [/* 108 vars */]) = -1 ENOENT (No such file or directory)
    execve("/galaxy_venv/bin/bgzip", ["bgzip", "data/raw/1670cbf6e5390abdaeb29b7"...], [/* 108 vars */]) = -1 ENOENT (No such file or directory)
    execve("/galaxy_venv/bin/bgzip", ["bgzip", "data/raw/1670cbf6e5390abdaeb29b7"...], [/* 108 vars */]) = -1 ENOENT (No such file or directory)
    execve("/usr/local/sbin/bgzip", ["bgzip", "data/raw/1670cbf6e5390abdaeb29b7"...], [/* 108 vars */]) = -1 ENOENT (No such file or directory)
    execve("/usr/local/bin/bgzip", ["bgzip", "data/raw/1670cbf6e5390abdaeb29b7"...], [/* 108 vars */]) = -1 ENOENT (No such file or directory)
    execve("/usr/sbin/bgzip", ["bgzip", "data/raw/1670cbf6e5390abdaeb29b7"...], [/* 108 vars */]) = -1 ENOENT (No such file or directory)
    execve("/usr/bin/bgzip", ["bgzip", "data/raw/1670cbf6e5390abdaeb29b7"...], [/* 108 vars */]) = -1 ENOENT (No such file or directory)
    execve("/sbin/bgzip", ["bgzip", "data/raw/1670cbf6e5390abdaeb29b7"...], [/* 108 vars */]) = -1 ENOENT (No such file or directory)
    execve("/bin/bgzip", ["bgzip", "data/raw/1670cbf6e5390abdaeb29b7"...], [/* 108 vars */]) = -1 ENOENT (No such file or directory)
    stat("/galaxy-central/database/dependencies/_conda/envs/mulled-v1-4f60687598c40f0675c7de956d89c6ebcb6abb8d1faef576907dbea7330e0778/lib/python2.7/subprocess.py", {st_mode=S_IFREG|0664, st_size=49417, ...}) = 0
[...]

One of the processes launched by the job is looking for bgzip and fails to find it anywhere. It's part of the tabix package, the generic indexer for TAB-delimited genome position files, which isn't included in the tool dependencies. It is easily fixed by installing tabix, but ideally it should be included in the deps list.
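
The strace finding suggests a cheap pre-flight check: look up each external program on PATH before running the job. The sketch below does this with the standard library. The list `REQUIRED_PROGRAMS` is an assumption for illustration (only bgzip is confirmed by the strace output; tabix is included because it ships in the same package), and `missing_programs` is a hypothetical helper, not part of the JBrowse tool:

```python
import shutil

# Assumed list for illustration; the real tool may call other programs as well.
REQUIRED_PROGRAMS = ["bgzip", "tabix"]


def missing_programs(programs, path=None):
    """Return the programs that cannot be found on PATH (or on `path` if given)."""
    return [name for name in programs if shutil.which(name, path=path) is None]


if __name__ == "__main__":
    missing = missing_programs(REQUIRED_PROGRAMS)
    if missing:
        print("Missing executables: " + ", ".join(missing))
    else:
        print("All required executables found.")
```

Running it inside the job's conda environment would show whether bgzip is visible there, without needing strace.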

This question is therefore considered answered as far as I'm concerned. However, I'm having display issues with the resulting track: it's just a series of red boxes with a 404 error. Therefore I will still continue updating this if I find a solution, but in comments to this answer since it's unclear whether it's related to the question at all.


Update: still doesn't work. VCF tracks display 404 errors, because the following files are missing in the results:

  • 1670cbf6e5390abdaeb29b71ad82f70b_0.vcf.tbi
  • 1670cbf6e5390abdaeb29b71ad82f70b_0.vcf

Manually did the following as a desperate shot in the dark:

ln -s 1670cbf6e5390abdaeb29b71ad82f70b_0.vcf.gz.tbi 1670cbf6e5390abdaeb29b71ad82f70b_0.vcf.tbi
gunzip 1670cbf6e5390abdaeb29b71ad82f70b_0.vcf.gz

No more 404 error messages, but the tracks are empty. On closer inspection, the browser logs contain the following error message, repeated roughly as many times as the track has "chunks":

invalid BGZF block header, skipping

Oh well. I suppose I did something stupid with the files, then. But why did the tool only do half of the work there?
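
A likely explanation for the message above is that a tabix index only makes sense alongside a BGZF-compressed file, and after gunzip the .vcf is plain text. As a hedged sketch, a file can be checked for a BGZF header as follows: gzip magic bytes, the FEXTRA flag, and a "BC" extra subfield. The helper name `is_bgzf` is hypothetical:

```python
import struct


def is_bgzf(path):
    """Return True if the file starts with a BGZF block header."""
    with open(path, "rb") as handle:
        header = handle.read(12)
        if len(header) < 12:
            return False
        id1, id2, method, flags = header[0], header[1], header[2], header[3]
        # gzip magic and deflate method are required, and the FEXTRA bit must be set
        if (id1, id2, method) != (0x1F, 0x8B, 8) or not flags & 4:
            return False
        (xlen,) = struct.unpack("<H", header[10:12])
        extra = handle.read(xlen)
    # Walk the extra subfields looking for 'BC' with a 2-byte payload
    pos = 0
    while pos + 4 <= len(extra):
        si1, si2 = extra[pos], extra[pos + 1]
        (slen,) = struct.unpack("<H", extra[pos + 2:pos + 4])
        if (si1, si2, slen) == (ord("B"), ord("C"), 2):
            return True
        pos += 4 + slen
    return False
```

A file produced by bgzip should pass this check, while a plain-text VCF or a file written by ordinary gzip should not.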

modified 15 months ago • written 15 months ago by Merope

Ok, no. I definitely don't get it. Why is it looking for non bgzip'ed files? How does a .vcf.tbi file make sense?

written 15 months ago by Merope

This is how you set up a vcf in JBrowse.
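
For reference, the usual tabix convention is a bgzip-compressed x.vcf.gz with its index in x.vcf.gz.tbi, which matches the file names seen earlier in this thread. The sketch below builds the two commands that produce that pair from a plain-text VCF. It assumes bgzip and tabix are on PATH, and the function names are hypothetical:

```python
import subprocess


def vcf_tabix_commands(vcf_path):
    """Return the compressed file name and the commands that create it and its index."""
    compressed = vcf_path + ".gz"
    return compressed, [
        ["bgzip", "-f", vcf_path],           # writes vcf_path + ".gz", replacing the input
        ["tabix", "-p", "vcf", compressed],  # writes compressed + ".tbi"
    ]


def compress_and_index(vcf_path):
    """Run both commands; requires bgzip and tabix to be installed."""
    compressed, commands = vcf_tabix_commands(vcf_path)
    for command in commands:
        subprocess.check_call(command)
    return compressed, compressed + ".tbi"
```

Calling `compress_and_index("x.vcf")` would leave x.vcf.gz and x.vcf.gz.tbi on disk, with no standalone x.vcf or x.vcf.tbi.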

modified 12 months ago • written 12 months ago by colindaven

Ok you're confirming that I had no idea what I was doing :'D Thanks for the info!

But then would you agree that getting an error because x.vcf and x.vcf.tbi are missing doesn't make sense?

written 12 months ago by Merope
12 months ago, colindaven wrote:

Try copying instead of linking? I have never had good experiences with symlinks (ln -s) when the web server user behind JBrowse has limited access.

Also, are you looking at a region where data is actually present?


I haven't tried copying, but looking back on it, what I did made little sense in the first place. I may still try it later, though. Thanks for the suggestion.

As for the region, it should not matter, as an empty region should not look like a neverending series of red error panels. Here is what it looks like:

[image: neverending error tinsel of death]

written 12 months ago by Merope