Question: SnpEff output searching for specific genes
federico.bernuzzi20 wrote:


I have successfully run SnpEff. The output VCF file contains 2.5 million lines (records). If someone is interested in looking at only a couple of genes, how would I go about doing this? In addition, can the output VCF file be imported into a different programme, for example Excel?

Guy Reeves wrote:

Maybe you can use the 'VCF-BEDintersect:' tool to slice out the genes using their coordinates in a BED file, or you can type the interval in directly. Cheers, Guy
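
For anyone working outside Galaxy, the same coordinate-based slicing can be sketched in a few lines of Python. This is only a minimal illustration, not how VCF-BEDintersect is implemented. The function names are made up for this example, and it assumes an uncompressed, tab-separated VCF with the chromosome in column 1 and the 1-based position in column 2.

```python
def in_interval(vcf_line, chrom, start, end):
    """Return True if a VCF data line lies inside chrom:start-end.

    Coordinates are 1-based and inclusive, as in the VCF POS column.
    A BED interval (0-based start, exclusive end) corresponds to
    start + 1 .. end in these terms.
    """
    fields = vcf_line.rstrip("\n").split("\t")
    return fields[0] == chrom and start <= int(fields[1]) <= end


def slice_vcf(lines, chrom, start, end):
    """Keep all header lines plus the records inside the interval."""
    return [
        ln for ln in lines
        if ln.startswith("#") or in_interval(ln, chrom, start, end)
    ]
```

To use it on a real file, pass the open file handle as `lines` and write the returned lines to a new file. The chromosome name has to match the VCF exactly (for example "chr1" versus "1").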

Agree - another good choice!

Jennifer Hillman Jackson wrote:


The tool Select can be used to pull out lines that match a specific text string such as a gene ID. There are other data manipulation tools that can filter data by matching text or chromosome coordinates (example: VCFfilter).
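
As a rough offline equivalent of selecting by gene, here is a small Python sketch. It assumes SnpEff wrote its annotations to the INFO column as an `ANN=` entry, with comma-separated annotations whose fields are split by `|` and whose fourth field is the gene name. Older SnpEff versions use an `EFF=` entry instead, which this sketch does not handle. The function names are hypothetical.

```python
def genes_in_record(vcf_line):
    """Collect the gene names found in the ANN entry of one VCF record."""
    info = vcf_line.rstrip("\n").split("\t")[7]  # INFO is the 8th column
    genes = set()
    for entry in info.split(";"):
        if entry.startswith("ANN="):
            for annotation in entry[len("ANN="):].split(","):
                parts = annotation.split("|")
                if len(parts) > 3 and parts[3]:
                    genes.add(parts[3])  # assumed Gene_Name position
    return genes


def select_genes(lines, wanted):
    """Keep header lines plus records annotated with any wanted gene."""
    wanted = set(wanted)
    return [
        ln for ln in lines
        if ln.startswith("#") or genes_in_record(ln) & wanted
    ]
```

Matching on the parsed gene name rather than on the raw text avoids picking up genes whose names merely contain the search string.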

VCF is a special version of tabular data, so importing into Excel or other spreadsheet applications is definitely possible. Download the data and change the file extension to be ".txt" or ".tabular". Remove the header before or after downloading or set those lines as a header within the downstream application. The same Select tool can be used to remove header lines in Galaxy ("not matching" the pattern ^#).
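
If you prefer to prepare the file on your own machine, the header handling described above can be sketched in Python as follows. This is just one possible approach with made-up function names: it drops the "##" meta lines, keeps the "#CHROM" line as the column header row, and writes a tab-separated file. Note that an Excel worksheet holds at most 1,048,576 rows, so a 2.5 million line VCF needs to be filtered down before it will load completely.

```python
import csv


def vcf_to_rows(lines):
    """Turn VCF lines into spreadsheet rows.

    "##" meta lines are dropped; the "#CHROM" line is kept as the
    header row with its leading "#" removed.
    """
    rows = []
    for ln in lines:
        if ln.startswith("##") or not ln.strip():
            continue
        rows.append(ln.rstrip("\n").lstrip("#").split("\t"))
    return rows


def write_tabular(rows, path):
    """Write rows as a tab-separated text file (e.g. "variants.txt")."""
    with open(path, "w", newline="") as handle:
        csv.writer(handle, delimiter="\t").writerows(rows)
```

A typical call would read the downloaded VCF with `open(...)`, pass the handle to `vcf_to_rows`, and then hand the result to `write_tabular` with an output name ending in ".txt".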

Hope this helps! Jen, Galaxy team

federico.bernuzzi20 wrote:

Many thanks for the reply, I shall look into it.

