Question: Adding new DataProvider for JSON
asmariyaz23 (United States) wrote, 3.4 years ago:

My goal is to read a JSON file from the History into a JS script (for visualization purposes). The mako template where I read the JSON from the hda is below:

<script src='/plugins/visualizations/igv/static/MutationInspectorWeb.js'></script>
<script>
   <%
     datalist = hda.datatype.dataprovider( hda, 'json' )
   %>
   var pipeline_inspector = ${h.to_json_string( datalist, indent=2 )};
   readData( pipeline_inspector );
</script>

In galaxy/lib/galaxy/datatypes/text.py:

from galaxy.datatypes import dataproviders
from galaxy.datatypes.dataproviders.dataset import JsonDataProvider

@dataproviders.decorators.has_dataproviders
class Json( Text ):
    ....
    @dataproviders.decorators.dataprovider_factory( 'json', JsonDataProvider.settings )
    def json_dataprovider( self, dataset, **settings ):
        dataset_source = dataproviders.dataset.DatasetDataProvider( dataset )
        return JsonDataProvider( dataset_source, **settings )

In galaxy/lib/galaxy/datatypes/dataproviders:

class JsonDataProvider( base.DataProvider ):
    def __init__( self, dataset, **kwargs ):
        self.dataset = dataset

    def __iter__( self ):
        if self.dataset is not None:
            # Yields a single list; list(line) splits each line into characters.
            yield [list(line) for line in self.dataset]
        else:
            yield

I end up returning a module and not a list of strings. The traceback is as below:

AttributeError: 'module' object has no attribute 'to_json_string'

MutationInspector.js requires a JSON object to run correctly. Can anyone help me figure out what I can do differently to return a JSON object or string?

galaxy dataprovider json • 1.3k views
modified 3.4 years ago • written 3.4 years ago by asmariyaz23
carlfeberhard (United States) wrote, 3.4 years ago:

Hello and awesome job so far!

I believe what's happening is the 'Json' string used to request a new provider:

data = list( hda.datatype.dataprovider( hda, 'Json' ) )

does not match the 'json' string you've used to define the provider:

@dataproviders.decorators.dataprovider_factory( 'json', JsonDataProvider.settings )
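
To illustrate why the mismatch matters, here is a minimal sketch of a name-keyed registry. It is not Galaxy's actual implementation; `PROVIDERS`, `provider_factory`, `json_provider`, and `get_provider` are hypothetical names. It only assumes that providers are looked up by an exact, case-sensitive string, which is what the mismatch above suggests.

```python
# Hypothetical stand-in for a name-keyed provider registry (not Galaxy's code).
PROVIDERS = {}


def provider_factory(name):
    """Register the decorated function under an exact, case-sensitive name."""
    def decorator(func):
        PROVIDERS[name] = func
        return func
    return decorator


@provider_factory('json')
def json_provider(lines):
    # Trivial provider body; only the registration name matters here.
    return list(lines)


def get_provider(name):
    """Look a provider up by name; 'Json' and 'json' are different keys."""
    if name not in PROVIDERS:
        raise KeyError('no provider registered as %r' % name)
    return PROVIDERS[name]
```

With this sketch, `get_provider('json')` returns the registered function, while `get_provider('Json')` raises a `KeyError` because dictionary keys are compared case-sensitively.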
written 3.4 years ago by carlfeberhard

I was able to get past the "No DataProvider" issue, but now I get an AttributeError. I have updated my original post to reflect where I am now. If you could take a look and comment, that would be great!

written 3.4 years ago by asmariyaz23

The documentation is outdated. I believe you can do that now with:

var pipeline_inspector = ${h.dumps( datalist, indent=2 )};
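
As a rough illustration of what that template line produces, the sketch below assumes `h.dumps` behaves like the standard library's `json.dumps` (an assumption; the helper's source is not shown in this thread). The sample `datalist` value is made up for the example.

```python
import json

# Made-up stand-in for whatever the dataprovider returns.
datalist = [{"sample": "s1", "count": 3}]

# Serialize to JSON text; indent=2 only affects whitespace, not the data.
literal = json.dumps(datalist, indent=2)

# JSON text like this can be pasted into a script as a JavaScript literal.
script_line = "var pipeline_inspector = %s;" % literal
print(script_line)
```

One caveat with this pattern: if a string inside the data contains `</script>`, it can end an inline script block early, so untrusted data may need extra escaping.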

Judging by the name pipeline_inspector, I assume you're using this pattern for debugging, at least for now. If it isn't just for debugging, keep one thing about JSON in mind: if you're not filtering the results, a dataprovider may not be necessary.

Dataproviders were meant to operate on non-json files and use json as a medium of communication between a dataset and a visualization. Also, since JSON is hierarchically structured, the iterator pattern used by dataproviders may not make sense.
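
A small sketch of why line-by-line iteration fits JSON poorly: in a pretty-printed document, individual lines are usually not valid JSON on their own, while the whole text is. The sample text and the `parses` helper are made up for illustration.

```python
import json

# Made-up, pretty-printed JSON document spanning several lines.
text = '{\n  "name": "demo",\n  "variants": ["a", "b"]\n}'


def parses(fragment):
    """Return True if the fragment is a complete JSON value on its own."""
    try:
        json.loads(fragment)
        return True
    except ValueError:
        return False


# Each line in isolation fails to parse ...
line_results = [parses(line) for line in text.splitlines()]

# ... while the document as a whole parses into one nested object.
whole = json.loads(text)
```

Here every entry in `line_results` is `False`, and `whole` is a single dictionary, which is why a provider that yields one item per line has nothing meaningful to yield for this kind of file.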

I'll see what I can do about making a JSON-specific provider available in the core, since it's a bit of an odd situation.

In the meantime, another way you might make a provider directly in the datatypes/text.py file is:

import json

from galaxy.datatypes import dataproviders

....

@dataproviders.decorators.has_dataproviders
class Json( Text ):
    edam_format = "format_3464"
    file_ext = "json"

...

    @dataproviders.decorators.dataprovider_factory( 'raw', {} )
    def raw_dataprovider( self, dataset ):
        lines = dataproviders.dataset.DatasetDataProvider( dataset )
        yield json.loads( ''.join( lines ) )

This will open the file, join its lines, and parse the result into a Python object (a dictionary when the top-level JSON value is an object).

And then, in your visualization mako file:

var pipeline_inspector = ${ h.dumps( hda.datatype.raw_dataprovider( hda ).next(), indent=2 ) };

You can use next here (and not list) since json is hierarchical and only 'has' one element (the dictionary/json-object).
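
The same idea can be tried outside Galaxy. The sketch below is a standalone stand-in, not the Galaxy provider itself: `raw_provider` is a hypothetical generator that takes any iterable of lines, joins them, and yields the parsed document once. It uses the built-in `next()`, which works on both Python 2.6+ and Python 3, whereas the `.next()` method call exists only on Python 2.

```python
import json


def raw_provider(lines):
    """Join the lines and yield the parsed JSON document exactly once."""
    yield json.loads(''.join(lines))


# Made-up input: one JSON object split across two chunks of text.
lines = ['{"a": 1, ', '"b": [2, 3]}']

# next() pulls the single parsed object out of the generator.
first = next(raw_provider(lines))

# list() also works, but wraps that same object in a one-element list.
as_list = list(raw_provider(lines))
```

So `first` is the dictionary itself, while `as_list` is a list containing only that dictionary, which is usually not what the visualization wants.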

 

 

written 3.4 years ago by carlfeberhard

My goal is to retrieve a JSON dataset from the history (the output of another pipeline) and feed it as a JSON object to MutationInspector.js. I will try the "dumps" method now.

written 3.4 years ago by asmariyaz23

Ok your solution worked like a charm!! Thanks...

written 3.4 years ago by asmariyaz23