How can I annotate 500 VCF files using SNPEff or something else and put each result in a separate folder named after the SRA identifier? Can this be done with Galaxy tools?
OK, so this assumes you have the 500 VCF files in a history and that dataset renaming works for this tool (it does for about 90% of tools).
1 Make a short workflow with the data input dataset(s) connected to the SNPEff tool. The point of using a workflow is that it lets you use the dataset renaming feature. In the workflow editor, select the tool, scroll down in its panel and click on 'Configure Output: 'snpeff_output''. This should open some options.
2 Go to 'Rename dataset' and add '#{input}' to the box. This uses the input name given at the top of the tool info, 'Data input 'input' (vcf, tabular, pileup or bed)', so your output datasets should end up with the same name as the input VCF dataset. If you want more detail on naming, click 'Click here for more information.' near the 'Rename dataset' box; it explains, among other things, how to add a suffix to the dataset name.
3 (Optional) Personally I also add something to the 'Tags' box just below, as this allows you to easily collect all the output files into a single history afterwards using the 'trick to collect tagged datasets'.
4 Click outside the last box you edited and save the workflow (don't forget).
5 Go to the history with all the datasets you want to work with and run the workflow you have saved. Where you select the VCF files, click on the little icon which looks like a pile of papers; this lets you select multiple VCFs at a time. For testing, just select a few. Don't go for all 500 the first time.
6 Select any other files required by the tool, then scroll down to the bottom. Check the 'Send results to a new history' box, then click 'Run workflow'.
7 If you are running just a few files, you should see a list of named datasets appear, which tells you whether the renaming worked as you wanted. (When you do this for 500 files you will almost certainly get a 'refresh error', which you can ignore.) A new history will be generated for each VCF.
8 You can go to User > Saved Histories to monitor progress across all the histories. As you refresh it, you will see one history created for each workflow run, and the datasets should be named after the original file.
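If you would rather do this outside Galaxy, the one-folder-per-SRA-identifier layout can be sketched in Python. This is only a rough sketch under assumptions that are not from the steps above: the VCF file names contain a run accession such as SRR/ERR/DRR plus digits, SNPEff is available locally as `snpEff.jar`, and the genome database name is something you supply yourself. The function names (`sra_id`, `plan_annotation`, `run_plan`) are made up for illustration.

```python
import re
import subprocess
from pathlib import Path

# Assumption: file names contain a run accession like SRR1234567, ERR42 or DRR001.
SRA_RUN = re.compile(r"([SED]RR\d+)")


def sra_id(vcf_path):
    """Return the SRA run accession found in the file name, or None."""
    match = SRA_RUN.search(Path(vcf_path).name)
    return match.group(1) if match else None


def plan_annotation(vcf_paths, out_root, genome, snpeff_jar="snpEff.jar"):
    """Map each SRA ID to (output folder, SNPEff command). Nothing is executed."""
    plan = {}
    for vcf in vcf_paths:
        run = sra_id(vcf)
        if run is None:
            continue  # skip files without a recognisable accession
        out_dir = Path(out_root) / run
        # Basic SNPEff call: java -jar snpEff.jar <genome> <input.vcf>
        cmd = ["java", "-jar", snpeff_jar, genome, str(vcf)]
        plan[run] = (out_dir, cmd)
    return plan


def run_plan(plan):
    """Run each command and write the annotated VCF into its own folder."""
    for run, (out_dir, cmd) in plan.items():
        out_dir.mkdir(parents=True, exist_ok=True)
        # SNPEff writes the annotated VCF to standard output.
        with open(out_dir / f"{run}.ann.vcf", "w") as handle:
            subprocess.run(cmd, stdout=handle, check=True)
```

To try it, build the plan from a handful of files first, e.g. `plan = plan_annotation(sorted(Path("vcfs").glob("*.vcf")), "annotated", genome="GRCh37.75")`, look at the planned folders and commands, and only then call `run_plan(plan)`. The `vcfs` folder and the `GRCh37.75` database name are placeholders; use whatever matches your data.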
Is this what you wanted? Did it work?
cheers
Guy
If the datasets (files) are named with the SRA identifier then there is a way which I can explain.
explain, OK..................