Question: Analysing Pooled Data With Freebayes
5.6 years ago, Nicola Smith wrote:
Hi, I am new to this and I hope someone can help. I have pooled sequencing data that I am trying to analyse using Galaxy. From quite a bit of online searching, it seems that FreeBayes should be able to do this if I select "set population", tick the "Assume that samples result from pooled sequencing" option, and change the ploidy to n × 2 (the number of alleles in the pool, where n is the number of subjects and the organism is diploid).

However, whenever I do this I get an error, usually just "Killed". I was originally setting my ploidy rather high (190, as I have 95 subjects pooled), so I wondered if this was the problem. However, it also fails with a ploidy of only 4.

I've tried various things to see where I am going wrong, all with the same BAM file:

Set population model options: Do not set --> works
Set population model options: set, Assume that samples result from pooled sequencing: not ticked, Default ploidy for the analysis: 2 --> works
Set population model options: set, Assume that samples result from pooled sequencing: ticked, Default ploidy for the analysis: 2 --> works
Set population model options: set, Assume that samples result from pooled sequencing: ticked, Default ploidy for the analysis: 4 --> fails (killed)
Set population model options: set, Assume that samples result from pooled sequencing: ticked, Default ploidy for the analysis: 10 --> fails (killed)

It seems that it is the ploidy part that I am doing wrong, as it works with the pooled option ticked and a ploidy of 2. I'm sure I have to change the ploidy though, or else how does the program know how many subjects are in the pool? Also, everywhere that I've read says you have to change the ploidy.

I apologise if my question is naive. As I said, I am new to Galaxy and this is the first thing I am trying to do! Any help or suggestions would be appreciated.

Thanks,
Nic
5.6 years ago, Jennifer Hillman Jackson (United States) wrote:
Hi Nic,

Yes, the program is running into a memory issue with this setting (confirmed by reviewing your bug report, thank you!). This issue is not specific to Galaxy or to our server/cluster. It appears to lie in the tool itself, and it comes up on different systems in different cases when the ploidy setting deviates from 1 or 2. So sticking with ploidy = 2 is one option.

You might try contacting the tool author at the Freebayes Google group for more detailed advice. The link is: https://groups.google.com/forum/#!forum/freebayes

Best,
Jen
Galaxy team

--
Jennifer Hillman-Jackson
Galaxy Support and Training
http://galaxyproject.org
Hi, I wanted to note that I've resolved this issue. The problem is that the tool must consider the likelihoods of all possible unphased genotypes with N alleles and M copies across P pools. This becomes quite a big number when N and M are both large, as can happen in low-complexity loci and deep pools.

See https://github.com/ekg/freebayes/commit/576bc703c246035342538a0feeecd13c28f3d2eb and also https://groups.google.com/forum/#!topic/freebayes/R6dReM4sPoQ for a discussion of how this can be dealt with.

The --use-best-n-alleles option previously applied only to SNPs, which made it ineffective at dealing with the combinatorial expansion, as most multiallelic loci contain indels or other kinds of non-SNP variation. In the most recent version it can be set low (e.g. 2 or 3 in your case) to prevent the memory blowup.

The most recent version of freebayes is not yet in Galaxy, but I am working on getting it available there. Sorry for the troubles. I hope you'll still have a chance to analyze your data with the pooled functionality in freebayes.

Erik
written 5.4 years ago by Erik Garrison
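
To get a feel for the numbers discussed in this thread, here is a minimal Python sketch of the counting argument only. It assumes that an unphased genotype at one locus in one pool can be treated as a multiset of M allele copies drawn from N candidate alleles, which gives C(M + N - 1, N - 1) possibilities. The function names are made up for illustration, and this is not a description of how freebayes enumerates genotypes internally.

```python
from math import comb


def pooled_ploidy(subjects: int, organism_ploidy: int = 2) -> int:
    """Ploidy to use for a pool: subjects times copies per subject (95 x 2 = 190)."""
    return subjects * organism_ploidy


def unphased_genotypes(num_alleles: int, ploidy: int) -> int:
    """Count multisets of `ploidy` copies drawn from `num_alleles` alleles.

    Assumption: order of copies does not matter (unphased), so the count is
    the stars-and-bars value C(ploidy + num_alleles - 1, num_alleles - 1).
    """
    return comb(ploidy + num_alleles - 1, num_alleles - 1)


if __name__ == "__main__":
    ploidy = pooled_ploidy(95)  # the 95-subject diploid pool from the question
    for n in (2, 3, 5, 8):
        print(f"ploidy={ploidy}, alleles={n}: {unphased_genotypes(n, ploidy)} genotypes")
```

Under this assumption, limiting the number of alleles considered per site (the idea behind --use-best-n-alleles) keeps N small, which is what keeps the count manageable even when the ploidy is as high as 190.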