Question: How to annotate 500 vcf files using SNPEff or smth else and put every separate result in a folder with the name of the SRA identifier?
msprindzhuk wrote:

How can I annotate 500 VCF files using SnpEff or something else, and put each result in a separate folder named after the SRA identifier? Can this be done using Galaxy tools?

snpeff batch annotation • 745 views
modified 19 months ago by Guy Reeves • written 19 months ago by msprindzhuk

If the datasets (files) are named with the SRA identifier, then there is a way, which I can explain.

written 19 months ago by Guy Reeves

explain, OK..................

written 19 months ago by msprindzhuk
Guy Reeves (Germany) wrote:

OK, so this assumes you have 500 VCF files in a history and that dataset renaming works for this tool (it does for about 90% of tools).

1 Make a short workflow with the data input datasets connected to the SnpEff tool. The reason for using a workflow is that it lets you use the dataset renaming feature. In the workflow window of the tool, scroll down and click on 'Configure Output: 'snpeff_output''. This should open some options.

2 Go to 'Rename dataset' and enter '#{input}' in the box. This uses the input name given at the top of the tool info, 'Data input 'input' (vcf, tabular, pileup or bed)', so your output datasets should hopefully be named the same as the input VCF datasets. If you want more info on naming, click 'Click here for more information.' near the 'Rename dataset' box; it explains how to add a suffix to the dataset name.

3 (Optional) Personally I also add something to the 'Tags' box just below, as this allows you to easily collect all the output files into a single history using this trick (trick to collect tagged datasets).

4 Click outside of the last box you edited and save the workflow (don't forget).

5 Go to the history with all the datasets you want to work with and run the workflow you saved. Where you select the VCF files, click on the little icon which looks like a pile of papers; this will allow you to select multiple VCFs at a time. For testing, just select a few; don't go for 500 the first time.

6 Select the other files required by the tool, then scroll down to the bottom. Check the 'Send results to a new history' check box, then click 'Run workflow'.

7 If you are doing just a few files then you should see a list of named datasets appear. This will tell you if the renaming has worked as you wanted. (When you do this for 500 files you will almost certainly get a 'refresh error', which you can ignore.) A new history will be generated for each VCF.

8 You can go to Users > Saved Histories to monitor progress across all the histories. As you refresh this you will see one history created for each workflow run, and the datasets should be named after the original file.
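
If clicking through 500 datasets gets tedious, steps 5 to 8 can probably also be scripted against the Galaxy API. The following is only a rough sketch, assuming the BioBlend Python library, a saved workflow whose VCF input is step "0", and a source history that contains only the VCFs. The function names, the step index and the URL/key placeholders are my assumptions, not something taken from the Galaxy interface.

```python
def history_name_for(dataset_name):
    """Derive a history name from a dataset name, e.g. 'SRR123.vcf' -> 'SRR123'."""
    for suffix in (".vcf.gz", ".vcf"):
        if dataset_name.lower().endswith(suffix):
            return dataset_name[: -len(suffix)]
    return dataset_name


def invoke_per_dataset(gi, workflow_id, source_history_id, input_step="0"):
    """Invoke the workflow once per live dataset, sending each run to a new history.

    `gi` is expected to be a bioblend.galaxy.GalaxyInstance, created e.g. with
        from bioblend.galaxy import GalaxyInstance
        gi = GalaxyInstance(url="https://your.galaxy.server", key="YOUR_API_KEY")
    (URL and key are placeholders).
    """
    contents = gi.histories.show_history(source_history_id, contents=True)
    invocations = []
    for item in contents:
        # Skip collections and deleted items; assumes everything else is a VCF.
        if item.get("history_content_type") != "dataset" or item.get("deleted"):
            continue
        # "hda" marks the input as a dataset that already sits in a history.
        inputs = {input_step: {"id": item["id"], "src": "hda"}}
        invocation = gi.workflows.invoke_workflow(
            workflow_id,
            inputs=inputs,
            # A new history with this name is created for the run.
            history_name=history_name_for(item["name"]),
        )
        invocations.append(invocation)
    return invocations
```

As with the manual route, it makes sense to try this on a history with only a few VCFs before running it on all 500.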

Is this what you wanted? Did it work?
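
If what you need is literal folders on disk rather than Galaxy histories, the "something else" option could be a small script around the SnpEff command line. This is a minimal sketch, assuming the VCF file names contain an SRA run accession (SRR/ERR/DRR followed by digits), that Java and snpEff.jar are available locally, and that "GRCh37.75" stands in for whatever genome database you actually use. The function names and the output layout are made up for illustration.

```python
import re
import subprocess
from pathlib import Path

# SRA run accessions look like SRR, ERR or DRR followed by digits.
SRA_RUN = re.compile(r"[SED]RR\d+")


def sra_id(vcf_path):
    """Return the first SRA run accession found in the file name, or None."""
    match = SRA_RUN.search(Path(vcf_path).name)
    return match.group(0) if match else None


def plan(vcf_paths, out_root, genome, snpeff_jar="snpEff.jar"):
    """Build (command, output_path) pairs, one output folder per accession."""
    jobs = []
    for vcf in vcf_paths:
        accession = sra_id(vcf)
        if accession is None:
            continue  # no accession in the name, so no folder name to use
        out_vcf = Path(out_root) / accession / f"{accession}.ann.vcf"
        # SnpEff writes the annotated VCF to standard output.
        cmd = ["java", "-jar", str(snpeff_jar), genome, str(vcf)]
        jobs.append((cmd, out_vcf))
    return jobs


def run(jobs):
    """Run each planned job, redirecting standard output into its folder."""
    for cmd, out_vcf in jobs:
        out_vcf.parent.mkdir(parents=True, exist_ok=True)
        with open(out_vcf, "w") as handle:
            subprocess.run(cmd, stdout=handle, check=True)
```

Something like `run(plan(sorted(Path("vcfs").glob("*.vcf")), "annotated", "GRCh37.75"))` would then annotate every VCF in a (hypothetical) `vcfs` folder. Calling `plan` on its own first lets you check the folder names before anything is executed.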

cheers

Guy

modified 19 months ago • written 19 months ago by Guy Reeves