Question: How to select multiple files as inputs to parse them simultaneously?
jpvillemin wrote:

The question is in the title. I am just getting started with Galaxy :)

I want to compute statistics over several files. The number of files to process together can vary (3 or more).

For this, I need to parse several files to get their values. My pipeline already does that, but I'd like to integrate it into Galaxy.

So far, I have only found a way to input a single file, or to input a library whose files are processed independently in batch mode.

But how can I pass several files at the same time (or a path to a directory), so that I can read through them and get all the file names and contents?

Update:

OK, using 'type="data" multiple="True"' seems to work. Now, how do I retrieve them: their paths and their names? And how do I set the output name according to the input?


Thanks.

modified 2.6 years ago by Bjoern Gruening • written 2.6 years ago by jpvillemin

Can you show us your code?

written 2.6 years ago by Bjoern Gruening

Bjoern Gruening wrote:

Hi,

I don't really know why you want to have the filename. Filenames in Galaxy have no meaning; they look like dataset_982737.dat, so you cannot do much with them.

If you have something like this:

<param name="inputs" type="data" multiple="True" label="Inputs Couverture"/>

You can do this in your command tag:

#for $input in $inputs:

    $input

#end for

or, shorter:

#echo ' '.join(map(str, $inputs))#

Your command line will look like this:

test.pl /path/to/database/dataset_2727.dat /path/to/database/dataset_2728.dat /path/to/database/dataset_2729.dat

Does this help?

Cheers,

Bjoern

modified 2.6 years ago • written 2.6 years ago by Bjoern Gruening

Yeah, kind of.

So you can execute Python code using #?

written 2.6 years ago by jpvillemin

This is not really Python; it is Cheetah (http://cheetahtemplate.org/).

written 2.6 years ago by Bjoern Gruening

OK thanks, great. I will take a look at Cheetah.

I want to concatenate strings inside a loop. What is the syntax for that? I'm not at work, so I can't share my code, but I tried a few things and didn't manage to make them work.

I want to read my input datasets (PA1, PA2, PA3) as you said:

#for $input in $inputs:

    $input

#end for

But when you do that, each value becomes a new parameter. I'd like to pass them as one parameter, so I'd like to concatenate them together.

written 2.6 years ago by jpvillemin

Try Python list comprehensions in Cheetah.

Something like this should work: "#echo ' '.join(map(str, $inputs))#". Note the " at the start and at the end. This will group everything into one parameter, like this: script.pl "file1.fasta file2.fasta file3.fasta"
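
To illustrate the difference the surrounding quotes make, here is a minimal Python sketch (not Galaxy code). It uses shlex.split from the standard library to mimic how a POSIX shell splits a command line into arguments. The dataset paths are hypothetical placeholders for whatever Galaxy would substitute.

```python
import shlex

# Hypothetical dataset paths, standing in for what Galaxy would substitute.
paths = [
    "/path/to/database/dataset_2727.dat",
    "/path/to/database/dataset_2728.dat",
    "/path/to/database/dataset_2729.dat",
]

joined = " ".join(paths)

# Without quotes, the shell splits on spaces: one argument per file.
unquoted = shlex.split("script.pl " + joined)

# With double quotes, the whole string arrives as a single argument.
quoted = shlex.split('script.pl "' + joined + '"')

print(len(unquoted) - 1)  # number of arguments after the script name
print(len(quoted) - 1)
```

The receiving script then has to split that single argument itself, so the separator should be a character that cannot occur in the paths.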

written 2.6 years ago by Bjoern Gruening

I'm not familiar with Python... And the input names I want to concatenate are in a hash. I'm using:

    #set $pathOuput = "TEST"
    #silent sys.stderr.write($pathOuput)
    #for $bam_count, $input_bam in enumerate( $input ):
      #$pathOuput += $pathOuput + $input_bam.name
    #end for
    #silent sys.stderr.write($pathOuput)

    # perl test.pl $pathOuput

I want to create a variable that concatenates all the values and pass it to my script. But I can't get it to work; the code inside the for loop doesn't work... I read the Cheetah docs (not many examples), and your comment was helpful, but I didn't find a final solution.

UPDATE: I finally found a way:

    #set $pathOuput = ""
    #silent sys.stderr.write($pathOuput+"\n")
    #for $bam_count, $input_bam in enumerate( $input ):
      #set $pathOuput += $input_bam.name+","
    #end for
    #set $pathOuput = $pathOuput[:-1]
    #silent sys.stderr.write($pathOuput)
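
For readers more used to plain Python than to Cheetah, the same logic can be sketched as follows. This is only an illustration with hypothetical names; in the template the values come from each dataset's name attribute.

```python
# Hypothetical dataset names, standing in for the $input_bam.name values.
names = ["PA1.csv", "PA2.csv", "PA3.csv"]

# Same approach as the template: append "name," then drop the last comma.
path_output = ""
for name in names:
    path_output += name + ","
path_output = path_output[:-1]

# Equivalent one-liner using join, which needs no trailing-comma cleanup.
joined = ",".join(names)

print(path_output)
```

Note that a comma-separated string only works as long as no dataset name contains a comma itself.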
modified 2.6 years ago • written 2.6 years ago by jpvillemin

Guy Reeves wrote:

Hi, I would be happy to help, but I think you need to simplify what you are asking, as I personally cannot quite follow what you are after. It may help to post the pipeline and then explain an example of what you want as an endpoint.

It sounds like dataset collections might be valuable, if you do not already know about them.

https://wiki.galaxyproject.org/DevNewsBriefs/2014_06_02?highlight=%28collections%29#Dataset_Collections

A: variable inputs workflow

Thanks Guy

written 2.6 years ago by Guy Reeves

I will try to explain better what I'd like to do.

I can have several input CSV files. Each file corresponds to one patient.

Imagine, for example, 3 files (PA1.csv, PA2.csv, PA3.csv).

Each input will have an output XLS file in which I compute ratios for the current patient versus each of the others. So I need to read the content of all three inputs before I can create my outputs.

So I will have 3 outputs (PA1.xls, PA2.xls, PA3.xls). (I'd like to zip them for download, but that is another problem.)

PA1.xls will contain the ratio values PA1/PA2 and PA1/PA3.

PA2.xls will contain the ratio values PA2/PA1 and PA2/PA3.

So I need to keep track of the filename of the file I am working on. And I'd like each output to have the same name as the corresponding input.
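
The per-patient ratio layout described above can be sketched in Python as follows. This is a simplified illustration that assumes a single numeric value per patient; the real files would hold many rows, and the function name pairwise_ratios is hypothetical.

```python
from itertools import permutations


def pairwise_ratios(values):
    """Return {patient: {"patient/other": ratio}} for every ordered pair."""
    ratios = {name: {} for name in values}
    for current, other in permutations(values, 2):
        ratios[current][f"{current}/{other}"] = values[current] / values[other]
    return ratios


# Hypothetical single value per patient, keyed by the input's name.
values = {"PA1": 10.0, "PA2": 5.0, "PA3": 2.0}
ratios = pairwise_ratios(values)
print(ratios["PA1"])
```

Keying everything by the patient name is what makes it necessary to pass the dataset names, and not only the dataset paths, to the script.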

I read about using metadata to get this information, but I found it really unclear. There is no clear list of what you can grab from your input parameters.

For example, if you have a param like the following, you can pass the file name in the command:

  <param name="input_annotation" type="data" label="Input Gene Annotation"/>

<command interpreter="perl">test.pl ${input_annotation.name} </command>

What other defined "tags" can we use?

Can I also grab the filename in the Perl script? Or do I have to pass it in beforehand via the XML?

And last question: how do you retrieve the filename (or anything related to the input file) when you process multiple files simultaneously?

<param name="input" type="data" multiple="True" label="Inputs Couverture"/>
<command interpreter="perl">test.pl ${input.name} </command> won't work.

Thanks, I hope it's clearer. I didn't post the pipeline because it would be a mess (there are other parameters...), so I preferred to explain the main idea. :)

written 2.6 years ago by jpvillemin