Question: Bowtie2 Select Reference Genome; eliminating false positive variants
Three questions: 1) When specifying a reference genome, what are the advantages and disadvantages of using Hs38 vs Hs38 canonical? 2) In Bowtie2, under "select reference genome", my workflow says Hs38 (Homo sapiens), but the text is cut off before I can determine whether it is set to Hs38 or Hs38 canonical. Clicking on it gives the list of options without highlighting which one was selected, so I have no way of knowing other than remembering what I think I selected, and hoping that by looking I haven't changed the selection. Can this be modified? 3) What is the syntax to eliminate variant sites where the chance of a false positive call is greater than 1/10000? Is it -f "QUAL > 40" in VCF filter, and is VCF filter or Naive Variant Caller the best tool to do this? I haven't yet stumbled on a place where the quality syntax is explained.


For each question:

  1. The canonical build excludes haplotypes and unmapped "chromosomes". This can give cleaner results, in particular when the reference genome annotation (GTF, GFF3) used in an analysis does not include annotation for those haplotype or unmapped reference sequences.

  2. The workflow editor's side bars (panes) can be expanded when content is too long to be fully viewed. Please see this graphic for an example. The selection can be modified within the editor or at run time.

  3. Either tool can filter by quality (as can a few others). Your syntax is correct for the VCF Filter tool: QUAL in a VCF file is Phred-scaled, so QUAL > 40 keeps sites whose estimated chance of a wrong call is below 1/10000. For the Naive Variant Caller tool, the quality thresholds (such as minimum mapping quality) are entered as simple numerical values. Did you need help with Galaxy form entry syntax for a different tool? Let us know and we can try to help clarify.
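
To make the link between the 1/10000 threshold and QUAL > 40 concrete, here is a minimal Python sketch of the Phred conversion defined in the VCF specification (QUAL = -10 * log10 of the probability that the call is wrong), together with a plain-text filter that mimics a "QUAL > 40" cut. This is not how the Galaxy VCF Filter tool is implemented; the function names and the example records are hypothetical, and treating a missing QUAL (".") as failing the filter is an assumption of this sketch.

```python
import math


def phred_to_error_prob(qual: float) -> float:
    """Convert a Phred-scaled quality to the probability that the call is wrong."""
    return 10 ** (-qual / 10)


def error_prob_to_phred(prob: float) -> float:
    """Convert an error probability to a Phred-scaled quality."""
    return -10 * math.log10(prob)


def passes_qual(vcf_line: str, min_qual: float = 40.0) -> bool:
    """Return True if a VCF data line has QUAL strictly greater than min_qual."""
    fields = vcf_line.rstrip("\n").split("\t")
    qual = fields[5]  # QUAL is the 6th tab-separated column of a VCF record
    if qual == ".":
        # Assumption of this sketch: records without a QUAL value are dropped.
        return False
    return float(qual) > min_qual


def filter_vcf_lines(lines, min_qual: float = 40.0):
    """Yield header lines unchanged and only those records that pass the QUAL cut."""
    for line in lines:
        if line.startswith("#") or passes_qual(line, min_qual):
            yield line


# Hypothetical records, for illustration only.
example = [
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "chr1\t1000\t.\tA\tG\t55.2\t.\t.",
    "chr1\t2000\t.\tC\tT\t12.0\t.\t.",
    "chr1\t3000\t.\tG\tA\t.\t.\t.",
]
kept = list(filter_vcf_lines(example))
```

With these example records, only the header and the chr1:1000 site are kept. A stricter false positive rate maps to a higher cutoff in the same way, for example 1/100000 corresponds to QUAL > 50.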

Thanks! Jen, Galaxy team

Reference: VCF specification.

