Question: Genome source from history in SnpEff
2.8 years ago, k2low wrote:

Hello, I am using SnpEff in Galaxy to add variant information to my VCF file. I have two questions about the "Genome source" option in SnpEff.

1. I successfully downloaded the GRCh38.76 database for SnpEff into my history using SnpEff Download (the dataset is green in my history) and tried to use it in SnpEff. I expected the database in my history to show up when I selected "Reference genome from your history" under Genome source, but the field still showed "No snpeffdb dataset available". Could you tell me what I am doing wrong?

2. I was able to run SnpEff using "Named on demand" instead of "Reference genome from your history". What is the advantage of downloading the database to the history?

Thanks in advance!

Kei

Tags: snpeff, galaxy
2.8 years ago, Jennifer Hillman Jackson (United States) wrote:

Hello,

The option for downloading the genome into the history is intended to reduce load: download it once, then reuse it. However, this usage seems to be problematic (feedback from our team is pending). I suspect the "dbkey" (database) assignments of the inputs do not match. SnpEff genomes include incremental versions in the key, while native genomes at http://usegalaxy.org do not, and they are often named differently (using UCSC identifiers, etc). It may be possible to create a custom reference genome "build" that uses the exact same dbkey as SnpEff (the database attribute), which would allow this option to work, but I have not tested that and it seems tedious for large genomes. It could also trigger a memory problem, since that custom reference genome would need to be used for all steps in the analysis (not just the SnpEff annotation step).
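To make the suspected mismatch concrete, here is a minimal Python sketch. It assumes the tool form only offers a history dataset when its dbkey is the exact same string as the input's dbkey, which is only my suspicion and not confirmed behavior. The helper names are hypothetical, and "hg38" is an assumed UCSC-style dbkey for the same assembly.

```python
from typing import Tuple

# Illustrative only: the helper names and the exact-equality rule are
# assumptions, not Galaxy's or SnpEff's actual matching code.


def split_snpeff_key(key: str) -> Tuple[str, str]:
    """Split a SnpEff database name such as 'GRCh38.76' into (assembly, version)."""
    assembly, _, version = key.partition(".")
    return assembly, version


def dbkeys_match(snpeff_key: str, galaxy_dbkey: str) -> bool:
    """Return True only when the two keys are the identical string."""
    return snpeff_key == galaxy_dbkey


snpeff_key = "GRCh38.76"  # database name from the question
galaxy_dbkey = "hg38"     # assumed UCSC-style dbkey for the same assembly

print(split_snpeff_key(snpeff_key))               # ('GRCh38', '76')
print(dbkeys_match(snpeff_key, galaxy_dbkey))     # False
print(dbkeys_match(snpeff_key, "GRCh38.76"))      # True
```

Under that assumption, only a reference genome whose dbkey is literally "GRCh38.76" would pair with the downloaded database, which is why a custom build with the same key might work.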

Using the "Named on demand" option does seem to function in small tests. I should mention, though, that in many cases certain options on the tool form are passed to the command line in a deprecated format. This issue can be tracked here (and may not be the root issue): https://github.com/jennaj/support-known-issues/wiki

I suggest using the "Named on demand" option when working on http://usegalaxy.org. If you are working on your own local or cloud Galaxy, the native genome indexes could be created so that the dbkey matches the SnpEff genome dbkeys, and then tested. Problems can be reported to the tool authors through the Tool Shed (http://usegalaxy.org/toolshed) or on GitHub.
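For anyone testing outside of Galaxy, the "download once, then reuse" idea can be sketched with SnpEff's own command line. This Python snippet only builds the argument lists and does not run anything. The jar location and the file name "input.vcf" are placeholders, and how the Galaxy wrapper calls SnpEff internally may differ.

```python
from typing import List

# Sketch only: "snpEff.jar" and "input.vcf" are placeholder paths.


def download_cmd(genome: str, jar: str = "snpEff.jar") -> List[str]:
    """Arguments that ask SnpEff to download a named database once."""
    return ["java", "-jar", jar, "download", genome]


def annotate_cmd(genome: str, vcf: str, jar: str = "snpEff.jar") -> List[str]:
    """Arguments that annotate a VCF against an already downloaded database.

    SnpEff writes the annotated VCF to standard output.
    """
    return ["java", "-jar", jar, genome, vcf]


print(" ".join(download_cmd("GRCh38.76")))
print(" ".join(annotate_cmd("GRCh38.76", "input.vcf")))
```

The lists can be passed to subprocess.run, redirecting standard output of the second command to a file to capture the annotated VCF.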

If our team has more feedback, we will post an update.

Sorry for the confusion in usage, Jen, Galaxy team

 
