Question: "Unable to finish job" when using discover_datasets in XML
13 months ago, thomas.nigel.lawson wrote:

I am hoping you can help with a problem I am having with the discover_datasets option, which I use to collect dynamically named output datasets. It fails on our High Performance Compute system but works on standard computers. Any ideas?

The code I am using is this:

    <data name="decon_targets" label="${tool.name} on ${on_string}: targets" format="tsv">
        <discover_datasets pattern="(?P&lt;designation&gt;.+)_target\.tsv" directory="." visible="true" format="tsv" />
        <filter>technology == "dims"</filter>
    </data>

And I get the error "Unable to finish job":

Traceback (most recent call last):
  File "/gpfs/apps/galaxy/viantm-dev/galaxy/lib/galaxy/jobs/runners/__init__.py", line 630, in finish_job
    job_state.job_wrapper.finish( stdout, stderr, exit_code )
  File "/gpfs/apps/galaxy/viantm-dev/galaxy/lib/galaxy/jobs/__init__.py", line 1374, in finish
    'primary': self.tool.collect_primary_datasets(out_data, tool_working_directory, input_ext, input_dbkey)
  File "/gpfs/apps/galaxy/viantm-dev/galaxy/lib/galaxy/tools/__init__.py", line 1613, in collect_primary_datasets
    return output_collect.collect_primary_datasets( self, output, job_working_directory, input_ext, input_dbkey=input_dbkey )
  File "/gpfs/apps/galaxy/viantm-dev/galaxy/lib/galaxy/tools/parameters/output_collect.py", line 325, in collect_primary_datasets
    primary_data.set_meta()
  File "/gpfs/apps/galaxy/viantm-dev/galaxy/lib/galaxy/model/__init__.py", line 2045, in set_meta
    return self.datatype.set_meta( self, **kwd )
  File "/gpfs/apps/galaxy/viantm-dev/galaxy/lib/galaxy/datatypes/tabular.py", line 976, in set_meta
    data_row = next(reader)
StopIteration
13 months ago, thomas.nigel.lawson wrote:

Found the solution to this.

It seems one of the output files was causing some problems.

The output file was a .tsv file containing a single row of Windows directory paths. I changed the file so that it has column headers and the directory paths are wrapped in apostrophes, and that fixed it!

After a bit more testing, I found that I always got this error whenever a tsv file consisted of one row with no column headers.
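For anyone hitting the same thing, here is a minimal sketch of why a one-row file could end in `StopIteration`. It assumes, based only on the `data_row = next(reader)` line in the traceback, that the metadata step reads a header row and then a first data row. It is not Galaxy's actual code: `read_header_and_first_row`, the column names, and the paths are all made up for illustration.

```python
import csv
import io


def read_header_and_first_row(text):
    """Hypothetical helper: read a header row, then the first data row."""
    reader = csv.reader(io.StringIO(text), delimiter="\t")
    header_row = next(reader)  # consumes the only row of a one-row file
    data_row = next(reader)    # raises StopIteration if there is no second row
    return header_row, data_row


# Illustrative content only: one row of Windows paths, with and without a header.
one_row = "C:\\data\\run1\tC:\\data\\run2\n"
with_header = "dir_a\tdir_b\n" + one_row

print(read_header_and_first_row(with_header))

try:
    read_header_and_first_row(one_row)
except StopIteration:
    print("StopIteration: no data row after the first line")
```

If that assumption holds, any file with fewer than two rows would fail in the same way, whatever the row contains.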


Our devs are going to test whether we can reproduce this. If so, this would be a bug in the datatype sniffer that we'll want to fix. Thanks for following up!

written 13 months ago by Jennifer Hillman Jackson

Great, thanks.

I had the same problem when using just the from_work_dir option as well:

    <data name="target" label="${tool.name} on ${on_string}: target"
          from_work_dir="target.tsv" format="tsv">
        <filter>technology == "dims"</filter>
    </data>

Again, I always got the error whenever a tsv file consisted of one row with no column headers. Once I added column headers, it was fixed.
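In case it helps, here is a rough sketch of the workaround on the tool side: always write a header line before the data, so the file never consists of a single row. `write_tsv_with_header`, the column names, and the example paths are hypothetical, and the temporary directory only stands in for the job working directory.

```python
import csv
import os
import tempfile


def write_tsv_with_header(path, header, rows):
    """Hypothetical helper: always write a header line before the data rows."""
    with open(path, "w", newline="") as handle:
        writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
        writer.writerow(header)
        writer.writerows(rows)


# Illustrative values only; a temp directory stands in for the working directory.
out_path = os.path.join(tempfile.mkdtemp(), "target.tsv")
write_tsv_with_header(
    out_path,
    ["dir_a", "dir_b"],
    [["C:\\data\\run1", "C:\\data\\run2"]],
)

with open(out_path, newline="") as handle:
    line_count = sum(1 for _ in handle)
print(line_count)
```

The written file has a header line plus one data line, which is the shape that stopped the error for me.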

written 13 months ago by thomas.nigel.lawson