I am working with human whole-genome sequence data. I have called variants using samtools and annotated my VCF file using SnpEff. The resulting file contains only Ensembl gene and transcript IDs. According to the SnpEff documentation, SnpEff supports RefSeq as well, but I am not getting any RefSeq gene or transcript IDs. How can I annotate my VCF using SnpEff so that I get RefSeq gene and transcript IDs for individual variants? Please help me with this.

Hello all,

I found the solution to the above. I downloaded the complete set of genes and their RefSeq IDs from the UCSC website in text format, and copied the whole-genome reference sequence, in FASTA format, to the same folder. I then rebuilt my existing human database (GRCh38) in SnpEff using the following command:

java -jar snpEff.jar build -refSeq -v GRCh38.86

(Note: I am using version 86 of the database.)

After rebuilding the database, I annotated my VCF file again and obtained a result file with RefSeq IDs instead of Ensembl IDs.
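
The steps above can be sketched roughly as the script below. This is only a sketch under stated assumptions, not the exact commands that were run: the install location, the input file names (refGene.txt, GRCh38.fa, input.vcf, output.refseq.vcf) and the target names genes.refseq and sequences.fa inside the database folder are assumptions based on how SnpEff's build step usually expects its inputs, so please check them against the SnpEff documentation for your version. By default the script only prints the commands (DRY_RUN=1).

```shell
#!/bin/sh
set -eu

# All paths and file names below are assumptions for illustration only.
SNPEFF_DIR="${SNPEFF_DIR:-./snpEff}"             # hypothetical SnpEff install folder
GENOME="GRCh38.86"                               # database name used in this post
DATA_DIR="$SNPEFF_DIR/data/$GENOME"              # folder holding the database inputs
REFSEQ_TABLE="${REFSEQ_TABLE:-refGene.txt}"      # hypothetical name of the UCSC text download
REFERENCE_FASTA="${REFERENCE_FASTA:-GRCh38.fa}"  # hypothetical name of the reference FASTA
INPUT_VCF="${INPUT_VCF:-input.vcf}"              # hypothetical VCF from the variant caller
OUTPUT_VCF="${OUTPUT_VCF:-output.refseq.vcf}"    # hypothetical annotated output
DRY_RUN="${DRY_RUN:-1}"                          # 1 = only print commands, 0 = execute them

# Print a command, and execute it only when DRY_RUN is not 1.
run() {
    echo "+ $*"
    if [ "$DRY_RUN" = "1" ]; then
        return 0
    fi
    "$@"
}

# 1. Put the RefSeq gene table and the reference sequence in the same folder.
#    (The names genes.refseq and sequences.fa are assumed build conventions.)
run mkdir -p "$DATA_DIR"
run cp "$REFSEQ_TABLE" "$DATA_DIR/genes.refseq"
run cp "$REFERENCE_FASTA" "$DATA_DIR/sequences.fa"

# 2. Rebuild the database from the RefSeq table (command from the post).
run java -jar "$SNPEFF_DIR/snpEff.jar" build -refSeq -v "$GENOME"

# 3. Annotate the VCF against the rebuilt database.
echo "+ java -jar $SNPEFF_DIR/snpEff.jar ann -v $GENOME $INPUT_VCF > $OUTPUT_VCF"
if [ "$DRY_RUN" != "1" ]; then
    java -jar "$SNPEFF_DIR/snpEff.jar" ann -v "$GENOME" "$INPUT_VCF" > "$OUTPUT_VCF"
fi
```

Running it as is shows the planned commands; setting DRY_RUN=0 (with the variables pointing at real files) would execute them.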

For a detailed explanation, see the "Building Database" section of the SnpEff documentation.

Thanks, Preeti

