Question: How to intersect vcf files for commonly mutated genes only (not the exact mutation)
10 weeks ago, lim.michelle.25 wrote:

Hi all,

I intersected two VCF files using vcf-vcf intersect and realized that I'm getting fewer results than expected, because the program only reports exact mutations shared between the two VCF files (e.g. Jak3 arginine to histidine at amino acid position 653). What I want is to find genes that are mutated in both files, regardless of the specific mutation. Is this possible?


10 weeks ago, Jennifer Hillman Jackson (United States) wrote:


There are a few ways to do this.

Galaxy Tutorials:

A VCF dataset can be annotated against a custom BED dataset (for example, one holding gene coordinates) using VCFannotate.
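To illustrate the idea behind BED-based annotation outside of Galaxy, here is a minimal Python sketch that assigns each variant to the gene interval it falls in and then compares the resulting gene sets. This is not how VCFannotate works internally; the function names (`load_bed`, `genes_hit`), the coordinates, and the example records are all made up for illustration. It assumes a BED file with at least four columns (chrom, start, end, gene name) and only checks the variant's POS, not the full span of longer deletions.

```python
def load_bed(lines):
    """Parse BED lines (chrom, start, end, name) into a list of intervals."""
    intervals = []
    for line in lines:
        if not line.strip() or line.startswith(("#", "track", "browser")):
            continue  # skip blank lines and BED header lines
        chrom, start, end, name = line.split("\t")[:4]
        intervals.append((chrom, int(start), int(end), name.strip()))
    return intervals


def genes_hit(vcf_lines, intervals):
    """Return the set of gene names whose interval contains a variant POS."""
    hits = set()
    for line in vcf_lines:
        if line.startswith("#") or not line.strip():
            continue  # skip VCF header and blank lines
        chrom, pos = line.split("\t")[:2]
        pos0 = int(pos) - 1  # VCF is 1-based; BED is 0-based, half-open
        for b_chrom, start, end, name in intervals:
            if chrom == b_chrom and start <= pos0 < end:
                hits.add(name)
    return hits


# Hypothetical gene coordinates and variant records, for illustration only.
bed_lines = ["chr8\t1000\t2000\tJak3", "chr11\t5000\t6000\tTp53"]
sample_a = [
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "chr8\t1500\t.\tG\tA\t.\t.\t.",
    "chr11\t5500\t.\tC\tT\t.\t.\t.",
]
sample_b = [
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "chr8\t1900\t.\tC\tT\t.\t.\t.",
    "chr2\t100\t.\tA\tG\t.\t.\t.",
]

intervals = load_bed(bed_lines)
shared = genes_hit(sample_a, intervals) & genes_hit(sample_b, intervals)
print(sorted(shared))  # ['Jak3']
```

The two samples carry different variants in the same gene, so an exact-variant intersect would return nothing, while the gene-level comparison still reports the gene. The nested loop is fine for small inputs; for whole-genome data an interval index would be a better fit.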

Alternatively, see the tools SnpEff and SnpSift. Annotate the VCFs with gene annotation, then extract that information into tabular datasets or load it into Gemini to query the content.
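If you end up working with the annotated VCFs outside Galaxy, the gene-level comparison could look roughly like the Python sketch below. It assumes the files were annotated by SnpEff using the `ANN` INFO field, where the gene name is the fourth `|`-separated subfield of each annotation. The function name `genes_from_vcf_lines`, the gene IDs, and the example records are invented for illustration.

```python
def genes_from_vcf_lines(lines):
    """Collect gene names from the ANN field of SnpEff-annotated VCF lines."""
    genes = set()
    for line in lines:
        if line.startswith("#") or not line.strip():
            continue  # skip header and blank lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 8:
            continue  # no INFO column present
        for entry in fields[7].split(";"):
            if not entry.startswith("ANN="):
                continue
            # One ANN value can hold several comma-separated annotations.
            for ann in entry[len("ANN="):].split(","):
                parts = ann.split("|")
                if len(parts) > 3 and parts[3]:
                    genes.add(parts[3])  # 4th subfield is the gene name
    return genes


# Made-up records: both files hit Jak3, but with different variants.
header = "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO"
vcf_a = [
    header,
    "chr8\t100\t.\tG\tA\t50\tPASS\t"
    "ANN=A|missense_variant|MODERATE|Jak3|GENE1|transcript|T1|protein_coding",
    "chr1\t200\t.\tC\tT\t50\tPASS\t"
    "DP=10;ANN=T|synonymous_variant|LOW|Tp53|GENE2|transcript|T2|protein_coding",
]
vcf_b = [
    header,
    "chr8\t150\t.\tC\tT\t60\tPASS\t"
    "ANN=T|stop_gained|HIGH|Jak3|GENE1|transcript|T1|protein_coding",
    "chr2\t300\t.\tA\tG\t60\tPASS\t"
    "ANN=G|missense_variant|MODERATE|Kras|GENE3|transcript|T3|protein_coding",
]

shared = genes_from_vcf_lines(vcf_a) & genes_from_vcf_lines(vcf_b)
print(sorted(shared))  # ['Jak3']
```

Within Galaxy, the equivalent would be to extract the gene column into a tabular dataset for each VCF and then compare the two tables on that column.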

To merge the VCFs, use VCF Combine, not Intersect. Make sure the samples in each file are labeled.

The tool groups of interest to you are primarily NGS: Variant Analysis, NGS: VCF Manipulation, and NGS: Gemini (for working with VCF data), plus Text Manipulation, Datamash, Filter and Sort, and Join, Subtract and Group (for working with tabular data).

Thanks! Jen, Galaxy team
