I have inherited a project where the original VCF files were lost. I only have the Excel files derived from those VCF files, for both normal and tumor samples, and I would like to compare the variants in them. Can I do this in Galaxy? I don't know of any variant manipulation packages that accept Excel files.
Converting the data back into VCF format would be the way to analyze it with the variant tools in Galaxy.
I don't know of any specific tools that automatically convert Excel to VCF. You are unlikely to find a script or command line that would do this without some tuning, since the layout can be changed in many custom ways during a VCF-to-Excel export.
What you can do:
- Export the data from Excel into tabular (tab-delimited text) format as a start.
- Then manipulate it from tabular back into VCF format.
- Review the data manipulation tools in Galaxy. Look in the Text manipulation and Datamash tool groups.
- Edit the file yourself with a text editor (Unix or otherwise).
- Search general bioinformatics websites for tips, shared scripts, etc.
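If you end up scripting the tabular-to-VCF step yourself, here is a rough Python sketch of what it could look like. It is only a starting point, not a tested converter for your files: the column names `Chrom`, `Pos`, `Ref` and `Alt` are made up for illustration, so change `COLUMN_MAP` to match whatever headers your spreadsheet actually has. It writes a minimal sites-only VCF (ID, QUAL, FILTER and INFO are set to "."), with no sample or genotype columns.

```python
import csv
import io

# Hypothetical spreadsheet headers; edit these to match your own export.
COLUMN_MAP = {"CHROM": "Chrom", "POS": "Pos", "REF": "Ref", "ALT": "Alt"}

# The eight fixed columns of a VCF record.
VCF_COLUMNS = ["#CHROM", "POS", "ID", "REF", "ALT", "QUAL", "FILTER", "INFO"]


def tabular_to_vcf(tabular_handle, vcf_handle, column_map=COLUMN_MAP):
    """Write a minimal sites-only VCF from a tab-delimited file with a header row.

    Returns the number of variant records written.
    """
    reader = csv.DictReader(tabular_handle, delimiter="\t")
    vcf_handle.write("##fileformat=VCFv4.2\n")
    vcf_handle.write("\t".join(VCF_COLUMNS) + "\n")
    written = 0
    for row in reader:
        chrom = (row.get(column_map["CHROM"]) or "").strip()
        pos = (row.get(column_map["POS"]) or "").strip()
        ref = (row.get(column_map["REF"]) or "").strip()
        alt = (row.get(column_map["ALT"]) or "").strip()
        # Skip rows that cannot form a record (missing field or non-numeric position).
        if not (chrom and pos.isdigit() and ref and alt):
            continue
        # ID, QUAL, FILTER and INFO are unknown here, so use the missing value ".".
        vcf_handle.write("\t".join([chrom, pos, ".", ref, alt, ".", ".", "."]) + "\n")
        written += 1
    return written


# Small in-memory demo: the second row has no position and is skipped.
demo = "Chrom\tPos\tRef\tAlt\nchr1\t12345\tA\tG\nchr2\t\tC\tT\n"
out = io.StringIO()
count = tabular_to_vcf(io.StringIO(demo), out)
print(out.getvalue())
```

For real data you would pass open file handles instead of the `io.StringIO` objects, once for the normal export and once for the tumor export. Check the result against the VCF specification before using it: depending on the downstream tool, you may also need to sort the records by position or add more header lines, and it is worth spot-checking that Excel did not alter any values during the original export.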
FAQ that includes a link to the current VCF specification: https://galaxyproject.org/learn/datatypes/#vcf
Hope this works out! Jen, Galaxy team