Question: Original VCF files lost; in Galaxy, can I use VCF manipulation tools on an Excel file (originally converted from VCF format)?
9 months ago by
eleni.stylianou0 wrote:

I have inherited a project where the original VCF files were lost. I have the Excel files derived from the VCF files for the normal and tumor samples, and I would like to compare the variants in them. Can I do this in Galaxy? I don't know of any variant manipulation packages that accept Excel files.

9 months ago by
United States
Jennifer Hillman Jackson25k wrote:


Converting the data back into VCF format is the way to make it usable with the variant analysis tools in Galaxy.

I don't know of any specific tools that automatically convert Excel to VCF. You are unlikely to find a script or command line that does this without some tuning, because the data can be customized in many ways during the VCF-to-Excel conversion.

What you can do:

  • Start by exporting the data from Excel into tabular (tab-separated) format.
  • Then try to manipulate it from tabular back into VCF format. Some options:
    • Review the data manipulation tools in Galaxy. Look in the Text manipulation and Datamash tool groups.
    • Edit the file yourself with a text editor, Unix command-line utilities, or other means.
    • Search general bioinformatics websites for tips, shared scripts, etc.
    • The Galaxy Main Tool Shed does have some tools for working with Excel data, but none perform the operation you want. Search with the keyword "excel" to find and review these. If any do seem useful (to you or others reading), note that these tools are for use in your own Galaxy, as they are not hosted at the public Galaxy Main site.
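
If you go the scripting route, the tabular-to-VCF step could look something like the Python sketch below. It is only a starting point and rests on assumptions that may not match your spreadsheet: that the exported file is tab-separated, that it has a header row, and that the columns are named CHROM, POS, ID, REF, ALT, QUAL, FILTER and INFO (any that are absent are filled with "."). The function name `tabular_to_vcf` is made up for this example. Sample/genotype columns are not handled, and the output carries only the minimal `##fileformat` header, so some tools may still need `##contig` or `##INFO` lines added.

```python
import csv
import io

# The eight fixed VCF columns, in the order the VCF specification requires.
# ASSUMPTION: the exported spreadsheet uses these exact names as its header;
# rename the columns in Excel or adjust this list to match your file.
VCF_COLUMNS = ["CHROM", "POS", "ID", "REF", "ALT", "QUAL", "FILTER", "INFO"]


def tabular_to_vcf(tabular_text, delimiter="\t"):
    """Convert delimited text with a header row into minimal VCF text.

    Hypothetical helper for illustration; not part of Galaxy or any library.
    Rows are written in input order; sort the result afterwards if a
    downstream tool requires coordinate-sorted input.
    """
    reader = csv.DictReader(io.StringIO(tabular_text), delimiter=delimiter)
    lines = ["##fileformat=VCFv4.2", "#" + "\t".join(VCF_COLUMNS)]
    for row_number, row in enumerate(reader, start=2):
        record = {}
        for name in VCF_COLUMNS:
            value = (row.get(name) or "").strip()
            # VCF uses "." for a missing value.
            record[name] = value if value else "."
        # CHROM, POS and REF cannot be missing in a usable variant record.
        for required in ("CHROM", "POS", "REF"):
            if record[required] == ".":
                raise ValueError(f"line {row_number}: missing {required}")
        # Excel sometimes exports integers as e.g. "12345.0"; normalize POS.
        record["POS"] = str(int(float(record["POS"])))
        lines.append("\t".join(record[name] for name in VCF_COLUMNS))
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    example = "CHROM\tPOS\tREF\tALT\nchr1\t12345\tA\tG\n"
    print(tabular_to_vcf(example), end="")
```

To use it on a real export, read the file into a string (for example with `open("normal.tsv").read()`, where the filename is a placeholder), pass it to `tabular_to_vcf`, write the result to a `.vcf` file, and upload that to Galaxy with the datatype set to vcf. Check a few records by eye against the spreadsheet before trusting the result.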

FAQ that includes a link to the current VCF specification:

Hope this works out! Jen, Galaxy team

