Question: Empty VCF file returned by ANNOVAR on Galaxy
23 months ago by
arvand.akbari10 wrote:

Hi,

I have a VCF file that I need to annotate with ANNOVAR. But when I use the ANNOVAR tab provided by Galaxy, the result is a completely empty VCF file. Can someone please explain why this happens and what I should do?

P.S.: I am not familiar with Linux, so I cannot use ANNOVAR directly. Thus, I have to use it through Galaxy.

Arvand

annovar galaxy • 1.3k views
ADD COMMENTlink modified 19 months ago by marcocassone0 • written 23 months ago by arvand.akbari10
23 months ago by
United States
Jennifer Hillman Jackson25k wrote:

Hello,

This tool is indexed only for the hg19 genome at http://usegalaxy.org. Inputs should be assigned hg19 as the database, and then annotations selected. It is possible that there are no overlaps between your VCF and the annotation, but more likely there is a genome mismatch problem. https://wiki.galaxyproject.org/Support#Reference_genomes
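
One way to check for a genome mismatch before uploading is to look at what the VCF says about itself. The sketch below is only an illustration and is not part of Galaxy or ANNOVAR: it collects any `##reference` and `##contig` header lines and counts the chromosome names used in the records. The function name `summarize_vcf`, the file path in the header, and the sample lines are made up for the example.

```python
from collections import Counter


def summarize_vcf(lines):
    """Return (reference_hints, chromosome_counts) for an iterable of VCF lines.

    reference_hints: header lines starting with ##reference or ##contig.
    chromosome_counts: Counter of the values in the first (CHROM) column.
    """
    hints = []
    chroms = Counter()
    for raw in lines:
        line = raw.rstrip("\n")
        if not line:
            continue
        if line.startswith("##"):
            if line.startswith(("##reference", "##contig")):
                hints.append(line)
        elif line.startswith("#"):
            continue  # the #CHROM column-header line
        else:
            chroms[line.split("\t", 1)[0]] += 1
    return hints, chroms


# Hypothetical input, for illustration only.
sample = [
    "##fileformat=VCFv4.1",
    "##reference=file:///path/to/hg19.fa",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "chr1\t978603\t.\tCCT\tC\t276.87\tPASS\t.",
    "chr1\t6530965\t.\tC\tCG\t1983.3\tPASS\t.",
    "2\t100\t.\tA\tG\t50\tPASS\t.",
]

hints, chroms = summarize_vcf(sample)
print(hints)
print(dict(chroms))
```

If the counts mix naming styles (for example `chr1` alongside `2`), or the header names a build other than hg19, that would be consistent with the mismatch described above. To run it on a real file, pass an open file handle instead of the `sample` list.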

If you are working with any other reference genome, this tool can be used in a local or cloud Galaxy. There is some setup involved, but we have instructions. Please see: https://wiki.galaxyproject.org/BigPicture/Choices

Best, Jen, Galaxy team


I would really appreciate it if you could let me know your opinion about the errors I posted below.

ADD REPLYlink written 23 months ago by arvand.akbari10
23 months ago by
arvand.akbari10 wrote:

Hi Jen,

Thank you for your thorough reply. I think you are right about there being a mismatch, because I also get a number of errors when I try to annotate with another tool. However, that other tool seems to be more tolerant of the mismatch and proceeds with the job by skipping approximately 700 variants; interestingly, all of them belong to ch1 and ch2. Only two error types are repeated many times. I am pasting them below for you to see:

reference allele not single, possible indel: chr1 978603 . CCT C 276.87 PASS AC=1;AF=0.50;AN=2;BaseQRankSum=-0.508;DP=24;FS=0.000;HRun=0;HaplotypeScore=56.8851;MQ=52.36;MQ0=0;MQRankSum=1.397;QD=11.54;ReadPosRankSum=1.397;SB=-89.83;TI=NM_198576;GI=AGRN;FC=Noncoding GT:AD:DP:GQ:PL:VF:GQX 0/1:17,7:24:99:316,0,784:0.292:99

unrecognizable alternative allele(s): chr1 6530965 . C CG 1983.3 PASS AC=2;AF=1.00;AN=2;DP=52;FS=0.000;HRun=5;HaplotypeScore=101.3783;MQ=59.88;MQ0=0;QD=38.14;SB=-731.61;TI=NM_001042665,NM_001042664,NM_020631,NM_001042663,NM_198681;GI=PLEKHG5,PLEKHG5,PLEKHG5,PLEKHG5,PLEKHG5;FC=Noncoding,Noncoding,Noncoding,Noncoding,Noncoding GT:AD:DP:GQ:PL:VF:GQX 1/1:1,50:52:99:2025,156,0:0.980:99

So do you think these errors are caused by mismatch?

And what do you think is the best course of action? Analyzing the FASTQ file again or resequencing the sample?

Best,

Arvand


Is the data annotated as "ch1" in some data and "chr1" in other data? This includes the base reference genome. I wouldn't expect this from the errors, though, if the output above is from a single input VCF.

Instead, it looks like the VCF has some data that the tool is unable to process. See this guide for expected format and troubleshooting: http://annovar.openbioinformatics.org/en/latest/articles/VCF/

Recreating the VCF within Galaxy could be the solution. Going back to sequencing is going too far. This is a technical problem with the inputs versus the tool.
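
Both rejected records quoted earlier in the thread have a REF or ALT allele longer than one base (`CCT` to `C`, and `C` to `CG`), so they are indels rather than single-base substitutions, which fits the "possible indel" wording of the first message. The following is a rough sketch for sorting REF/ALT pairs into these groups, to see how many records a tool that expects single-base alleles might skip. The function name `classify_variant` is hypothetical and the logic is a simplification; it is not how ANNOVAR or the other tool parses the file.

```python
def classify_variant(ref, alt):
    """Classify one REF/ALT pair by allele length.

    Returns "snv", "insertion", "deletion", or "other".
    Multi-allelic ALT fields (comma-separated) are reported as "other".
    """
    bases = set("ACGTN")
    if "," in alt or not ref or not alt:
        return "other"
    if not (set(ref.upper()) <= bases and set(alt.upper()) <= bases):
        return "other"  # symbolic or missing alleles such as <DEL> or "."
    if len(ref) == 1 and len(alt) == 1:
        return "snv"
    if len(ref) < len(alt):
        return "insertion"
    if len(ref) > len(alt):
        return "deletion"
    return "other"  # same length above 1, e.g. a multi-base substitution


# REF/ALT pairs taken from the two error messages in this thread,
# plus one made-up single-base substitution for comparison.
print(classify_variant("CCT", "C"))  # deletion
print(classify_variant("C", "CG"))   # insertion
print(classify_variant("A", "G"))    # snv
```

If nearly all skipped records turn out to be insertions or deletions, the problem is more likely the indel representation in the VCF than the reference genome.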

ADD REPLYlink modified 23 months ago • written 23 months ago by Jennifer Hillman Jackson25k

Thank you for your advice. I will try regenerating the VCF.

ADD REPLYlink written 23 months ago by arvand.akbari10
19 months ago by
marcocassone0 wrote:

ANNOVAR doesn't work with hg19 either. Why? Thanks.


Are you using http://usegalaxy.org? hg19 is supported there. If you are having a problem, please submit a bug report and we can review and give feedback. Make certain all inputs and outputs are undeleted and include a link to this Biostars post: https://wiki.galaxyproject.org/Support#Reporting_tool_errors

Other servers, including local ones, determine on their own which genome databases to index and support. This is part of installing and configuring the tool. How-to instructions are in the tool's README in the Tool Shed: http://usegalaxy.org/toolshed

Thanks! Jen, Galaxy team

ADD REPLYlink written 19 months ago by Jennifer Hillman Jackson25k