Tool name: Variant Recalibrator
Tool version: 0.0.4
Tool ID: toolshed.g2.bx.psu.edu/repos/devteam/variant_recalibrator/gatk_variant_recalibrator/0.0.4
ToolShed URL: https://toolshed.g2.bx.psu.edu/view/devteam/variant_recalibrator
I have been trying to implement the GATK Best Practices protocol (Geraldine A. Van der Auwera et al.,
Curr Protoc Bioinformatics; 11(1110): 11.10.1–11.10.33. doi:10.1002/0471250953.bi1110s43) on the main Galaxy server, since I am more comfortable using Galaxy.
I got as far as producing a raw SNP VCF (raw_SNP.vcf) from Unified Genotyper (HaplotypeCaller is not available in Galaxy), using the realigned and BQSR-recalibrated BAM. The annotations used in the subsequent Variant Recalibration instructions are present in raw_SNP.vcf. However, whenever I try to run Variant Recalibrator, I get the following error:
An error occurred with this dataset:
Picked up _JAVA_OPTIONS: -Djava.io.tmpdir=/galaxy-repl/main/scratch [Sun Feb 01 14:43:22 CST 2015] net.sf.picard.sam.CreateSequenceDictionary REFERENCE=/galaxy-repl/main/scratch/tmp-gatk-74BJCX/gatk_input.fasta OUTPUT=/galaxy-repl/main/scratch/tmp-gatk-7
I am using the GATK ucsc.hg19.fasta (from Galaxy shared data), my input raw_SNP.vcf, and the resource files listed below.
For the resource files, I have tried hapmap_3.3.hg19.vcf, 1000gomni2.5hg19.vcf and dbSNP_135.hg19.vcf (each one both as downloaded from the GATK Galaxy shared data and as taken from the GATK bundle on their FTP site), as well as 1000g-phase1.snps.highconfidence.hg19.vcf (from the GATK bundle on the FTP site). I have tried both entering the known, training and truth details explicitly and not entering them.
I have checked the following annotations (which are present ONLY in the raw_SNP.vcf file, not in the ROD files, although I don't see how I could annotate those without their respective BAM files?):
DP
QD
FS
MQRankSum
ReadPosRankSum
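
One way to confirm that the annotations listed above really are present on the records of the input VCF is a small script like the following. This is only a minimal sketch using the Python standard library: the function name `count_info_annotations` and the two example records are hypothetical and are not taken from the actual raw_SNP.vcf. It assumes a plain-text, tab-separated VCF with the INFO column in the eighth position.

```python
import io

# Annotations selected in the Variant Recalibrator form (from the post above).
REQUIRED = ["DP", "QD", "FS", "MQRankSum", "ReadPosRankSum"]

def count_info_annotations(lines, keys=REQUIRED):
    """Return (number of records, {annotation: records carrying it in INFO})."""
    counts = {key: 0 for key in keys}
    total = 0
    for line in lines:
        # Skip header lines and blank lines.
        if line.startswith("#") or not line.strip():
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 8:
            continue  # not a complete VCF record
        total += 1
        # INFO is the 8th column: semicolon-separated KEY=VALUE pairs or flags.
        present = {entry.split("=", 1)[0] for entry in fields[7].split(";")}
        for key in keys:
            if key in present:
                counts[key] += 1
    return total, counts

# Hypothetical two-record VCF, used only to illustrate the call.
sample = io.StringIO(
    "##fileformat=VCFv4.1\n"
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n"
    "chr1\t100\t.\tA\tG\t50\t.\tDP=10;QD=5.0;FS=0.0;MQRankSum=0.1;ReadPosRankSum=0.2\n"
    "chr1\t200\t.\tC\tT\t60\t.\tDP=12;QD=6.0;FS=1.2\n"
)
total, counts = count_info_annotations(sample)
for key in REQUIRED:
    print(f"{key}: {counts[key]} of {total} records")
```

For a real file, the same function can be called on an open file handle, e.g. `count_info_annotations(open("raw_SNP.vcf"))`. Note that the rank-sum annotations (MQRankSum, ReadPosRankSum) are typically absent from some records, so a count below the total for those two is not necessarily a problem in itself.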
I have tried leaving the GATK and analysis tabs at "Basic". The only difference I can see is that the paper suggests a percentBad of 0.01 and the Galaxy default is 0.03 (I have tried both, though).
Any help would be appreciated.
DoireRosie