Question: Getting the number of SNPs and insertions/deletions from a VCF with VCFfilter
2.7 years ago by
lorenasfe50 wrote:


I am trying to extract from a VCF file the number of SNPs (single nucleotide variants), insertion/deletion variants, multi-nucleotide variants, and variants with multiple alternate alleles. I am trying to use VCFfilter in Galaxy, but I keep getting errors whenever I try to filter. What do I have to write in 'Specify filtering expression'? Is there another way to get that information?

vcffilter indels snp galaxy vcf • 4.0k views
modified 23 months ago by chen.randy110 • written 2.7 years ago by lorenasfe50
23 months ago by
chen.randy110 wrote:

Use VCFfilter with an expression such as -f "TYPE = del" to keep only the variants you need.
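
For anyone who wants the counts outside Galaxy: the TYPE in that expression is a key in the INFO column. Below is a minimal Python sketch that tallies it, assuming the VCF actually carries a TYPE annotation (freebayes writes one with values such as snp, mnp, ins, del and complex; other callers may not). The function name `count_info_types` and the example records are made up for illustration.

```python
from collections import Counter


def count_info_types(vcf_lines):
    """Tally the values of the INFO TYPE key over VCF data lines."""
    counts = Counter()
    for line in vcf_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and header lines
        fields = line.rstrip("\n").split("\t")
        info = fields[7]  # INFO is the 8th tab-separated column
        for entry in info.split(";"):
            if entry.startswith("TYPE="):
                # Multi-allelic records list one value per ALT allele
                counts.update(entry[len("TYPE="):].split(","))
    return counts


# Made-up records, only to show the input shape
example = [
    "##fileformat=VCFv4.2\n",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n",
    "chr1\t100\t.\tA\tG\t50\tPASS\tDP=10;TYPE=snp\n",
    "chr1\t200\t.\tAT\tA\t50\tPASS\tDP=12;TYPE=del\n",
    "chr1\t300\t.\tC\tCA,T\t50\tPASS\tDP=9;TYPE=ins,snp\n",
]
print(count_info_types(example))
```

With a real file you would pass an open file handle instead of the list. Note that this counts alleles, not records, so a site with two alternate alleles contributes two entries.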

written 23 months ago by chen.randy110

This helped me! Thanks, chen.randy.

written 23 months ago by ron10
2.6 years ago by
Guy Reeves1.0k wrote:

Hi, I can help you write an expression for VCFfilter, but first can you check whether NGS: GATK Tools (beta) > Select Variants from VCF files will let you do what you want? I think it is easier.

Under 'Basic or Advanced Analysis options', select 'Advanced'.

Then scroll down

and check whichever boxes you want under 'Select only a certain type of variants from the input file': INDEL, SNP, MIXED, MNP, SYMBOLIC, NO_VARIATION.

You may also want to look at the 'Select only variants of a particular allelicity' option. This allows you to count what you want from the output .vcf files. This should all work on Tell me if it works. Guy
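
If the Galaxy tool is not available, roughly the same breakdown can be approximated from the REF and ALT columns alone. The Python sketch below borrows the category names from the checkbox list above, but it is a simplified classification under my own assumptions, not GATK's actual logic: it compares the raw allele strings without any trimming or normalization, and it treats only `<...>` alleles as symbolic. The names `classify_record` and `count_variants` and the example records are hypothetical.

```python
from collections import Counter


def classify_record(ref, alt_field):
    """Roughly classify one VCF record from its REF and ALT columns."""
    # Ignore missing ('.') and spanning-deletion ('*') placeholders
    alts = [a for a in alt_field.split(",") if a not in (".", "*")]
    if not alts:
        return "NO_VARIATION"
    if any(a.startswith("<") for a in alts):
        return "SYMBOLIC"
    kinds = set()
    for alt in alts:
        if len(ref) != len(alt):
            kinds.add("INDEL")  # length change: insertion or deletion
        elif len(ref) == 1:
            kinds.add("SNP")    # single base replaced by a single base
        else:
            kinds.add("MNP")    # same length, more than one base
    return kinds.pop() if len(kinds) == 1 else "MIXED"


def count_variants(vcf_lines):
    """Return (per-type record counts, number of multi-allelic records)."""
    type_counts = Counter()
    multiallelic = 0
    for line in vcf_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and header lines
        fields = line.rstrip("\n").split("\t")
        ref, alt = fields[3], fields[4]
        type_counts[classify_record(ref, alt)] += 1
        if "," in alt:
            multiallelic += 1  # more than one ALT allele listed
    return type_counts, multiallelic


# Made-up records, only to show the input shape
example = [
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n",
    "chr1\t100\t.\tA\tG\t50\tPASS\t.\n",
    "chr1\t200\t.\tAT\tA\t50\tPASS\t.\n",
    "chr1\t300\t.\tC\tCA,T\t50\tPASS\t.\n",
    "chr1\t400\t.\tAC\tGT\t50\tPASS\t.\n",
    "chr1\t500\t.\tG\tA,T\t50\tPASS\t.\n",
]
types, multi = count_variants(example)
print(types, multi)
```

This needs no reference genome, since it only reads the VCF itself. Counts may differ from GATK's for records whose alleles are not left-trimmed or normalized.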

modified 2.6 years ago • written 2.6 years ago by Guy Reeves1.0k

Thanks so much for your help! In the end I managed to count with VCFfilter, but next time I will try GATK as you suggested; it seems much easier.

written 2.6 years ago by lorenasfe50
23 months ago by
ron10 wrote:

Hi! I know you both wrote this a few months ago, but I am now in exactly the same situation as lorenasfe. Could either of you help me with using VCFfilter?

@ lorenasfe: how did you do it?

@ Guy Reeves: I have tried the method you explained, but I get quite lost in the process, and I cannot find a way to choose a reference genome. When I select "from the history", I am required to provide a FASTA file, which I do not have and have not used at any point in the whole process. When I select "locally cached", it simply says "No options available" and does not allow me to make any changes.

For these reasons, I would ask you both, or anyone else reading this post, for any tips. I am working with hg19. As a last resort, I even tried the filter in Excel, but it does not seem to get me anywhere. I believe VCFfilter should be a better tool, although I am open to new ideas.


written 23 months ago by ron10

As "locally cached" does not give you the hg19 option you want, I guess you are not working on I suggest you register for an account and move your data there (or at least part of it). Then you can see what works, and afterwards work on installing reference genomes onto your local Galaxy instance.

written 23 months ago by Guy Reeves1.0k