Question: How to get the name of the file (.dat) through the API?
vaskin90 wrote (3.5 years ago):

I'm trying to use Galaxy for NGS analysis. I'm running it on a server, so I don't use or need the GUI. I need to merge multiple files (BAM files, specifically), so I need to run the workflow with multiple inputs, and I don't know the number in advance. OK, there are steps, but I cannot do the merging with them, since at each step the merging workflow has only one dataset at a time, right? So I was wondering whether I can get the paths to the actual files (the .dat files) through the API, so that I can combine them into a comma-separated line and pass it to my merger.

jmchilton wrote (3.5 years ago):

I would recommend against passing explicit file paths to your tool like this: it breaks down Galaxy's abstractions for security, provenance, etc. Galaxy tools should consume datasets, not files. Alternative approaches include rewriting the workflow on the fly to accommodate your number of inputs; breaking the execution into a few workflows and using the tools API to run the merge in between the mapping steps at the beginning and the steps after the BAM merging; or using the new support for such workflows available by creating dataset collections.

However, if you still want this information, it can certainly be obtained from the API: the result of the API call GET /api/histories/<encoded_history_id>/contents/<encoded_dataset_id> should contain an attribute named file_name. By default only admins can see this, but the configuration option expose_dataset_path in your universe_wsgi.ini can be set to True to expose this information to all users.
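
As a rough illustration of that call, here is a minimal Python sketch using only the standard library. It assumes the API key is passed as a `key` query parameter and that the response is a JSON object carrying `file_name`; the helper names (`dataset_details_url`, `fetch_dataset_details`, `comma_separated_paths`) and the sample paths are made up for this example and are not part of Galaxy or BioBlend.

```python
import json
import urllib.parse
import urllib.request


def dataset_details_url(galaxy_url, history_id, dataset_id, api_key):
    """Build the URL for GET /api/histories/<history_id>/contents/<dataset_id>.

    Assumption: the API key is accepted as a `key` query parameter.
    """
    path = "/api/histories/{}/contents/{}".format(
        urllib.parse.quote(history_id, safe=""),
        urllib.parse.quote(dataset_id, safe=""),
    )
    query = urllib.parse.urlencode({"key": api_key})
    return galaxy_url.rstrip("/") + path + "?" + query


def fetch_dataset_details(galaxy_url, history_id, dataset_id, api_key):
    """Perform the GET request and decode the JSON body into a dict."""
    url = dataset_details_url(galaxy_url, history_id, dataset_id, api_key)
    with urllib.request.urlopen(url) as response:
        return json.loads(response.read().decode("utf-8"))


def comma_separated_paths(details_list):
    """Join the file_name attribute of each dataset into one comma-separated string.

    Raises KeyError when file_name is absent, which is what a non-admin
    user would hit unless expose_dataset_path is enabled.
    """
    paths = []
    for details in details_list:
        if "file_name" not in details:
            raise KeyError("file_name missing; is expose_dataset_path enabled?")
        paths.append(details["file_name"])
    return ",".join(paths)


if __name__ == "__main__":
    # Offline demo with hypothetical responses; no Galaxy server is contacted.
    sample = [{"file_name": "/data/a.dat"}, {"file_name": "/data/b.dat"}]
    print(comma_separated_paths(sample))
```

Against a real server you would call `fetch_dataset_details` once per dataset and pass the collected dicts to `comma_separated_paths`. The caveat above still applies: a tool fed raw paths bypasses Galaxy's dataset abstractions.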


Thank you, John.

Actually, I'd love to use datasets, and it looks like dataset collections could help me. But since they are still under development, I'd rather wait until they are well documented and stable.

Breaking up the workflow won't solve the problem, since the multiple datasets still have to be passed somehow.

Rewriting the workflow on the fly? That could work, but I didn't find a natural way of doing it. I'm using BioBlend, which does not have this option.

Using any of the solutions above (except dataset collections) would make my code complicated.

expose_dataset_path is what I needed. For now I'll stick with this solution; it will be the only hack.

 

written 3.5 years ago by vaskin90