Hi everybody,

After several attempts, I have noticed something about the BEDtools tool shuffleBed, and I would be glad to get some feedback on it and to know whether anyone else has already seen this problem. What I want to do is simple: randomize my genomic regions to see whether the characteristics of these regions are unexpected or not. With shuffleBed, I use options that preserve the distribution of my regions per chromosome, exclude unmappable regions, and forbid overlaps between the shuffled regions.
After analyzing the random file I obtained, the chromosome assignments are preserved, and so are the sizes of my genomic regions, but I see that some of the shuffled regions overlap each other. The effect is smaller (reduced by a factor of about 4) if I multiply the number of placement attempts by 10 (maxTries option). But in the end, there are still some regions that overlap.

I suppose it depends on how many regions there are and, moreover, on the sizes of the regions that have already been randomly placed. Indeed, if a lot of short regions have already been positioned, it becomes less and less likely that a gap large enough for a large region remains. So shuffleBed seems to return the last random position it tried for the large region, even if this position overlaps another region. Is my understanding of how this tool works correct? Finally, my question is the following: has anyone else already observed this problem, and how can it be handled?
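To make my hypothesis concrete, here is a toy sketch in Python of the behaviour I am assuming. It is not the shuffleBed source code: the function names (overlaps, toy_shuffle), the single chromosome, and above all the rule "keep the last tried position when every try overlaps" are my own assumptions, written only to illustrate why large regions placed late would end up overlapping.

```python
import random


def overlaps(start, end, placed):
    """Return True if the half-open interval [start, end) overlaps any placed interval."""
    return any(start < p_end and p_start < end for p_start, p_end in placed)


def toy_shuffle(lengths, chrom_size, max_tries, rng):
    """Place each region at a random position, retrying up to max_tries times.

    ASSUMPTION (my hypothesis, not verified against shuffleBed): when every
    try overlaps an already placed region, the last tried position is kept.
    Requires max_tries >= 1 and every length <= chrom_size.
    """
    placed = []
    forced = 0  # number of regions kept even though they overlap
    for length in lengths:
        for _ in range(max_tries):
            start = rng.randint(0, chrom_size - length)
            end = start + length
            if not overlaps(start, end, placed):
                break  # found a free position
        else:
            forced += 1  # all tries overlapped; keep the last position anyway
        placed.append((start, end))
    return placed, forced


# Many short regions first, one large region last, on a hypothetical 1000 bp chromosome.
lengths = [50] * 15 + [400]
for tries in (1, 10, 100):
    _, forced = toy_shuffle(lengths, 1000, tries, random.Random(0))
    print(f"max_tries={tries}: {forced} region(s) placed with an overlap")
```

In this toy model, raising max_tries should reduce the number of forced overlaps without necessarily removing them, which is the pattern I see in my real data.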
I thought about sorting my input file by the length of the regions I want to position randomly. But first of all, this would bias the randomization, and secondly, shuffleBed itself seems to randomize the order, with respect to size, in which the regions are positioned. So this is not a solution. What else could I do?
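Whatever workaround is suggested, I would need a way to measure how many overlaps remain in the shuffled output. A minimal sketch of such a check in Python could look like the following. The function name count_overlaps is hypothetical, and I am assuming a standard BED layout (chromosome, 0-based start, exclusive end in the first three columns).

```python
from collections import defaultdict


def count_overlaps(bed_lines):
    """Count regions that overlap an earlier region on the same chromosome.

    Assumes BED coordinates: 0-based start, exclusive end, so two regions
    that merely touch (end == next start) are not counted as overlapping.
    """
    by_chrom = defaultdict(list)
    for line in bed_lines:
        line = line.strip()
        # Skip blank lines and common BED header lines.
        if not line or line.startswith(("#", "track", "browser")):
            continue
        chrom, start, end = line.split()[:3]
        by_chrom[chrom].append((int(start), int(end)))

    n_overlapping = 0
    for intervals in by_chrom.values():
        intervals.sort()
        max_end = -1  # furthest end seen so far on this chromosome
        for start, end in intervals:
            if start < max_end:
                n_overlapping += 1
            max_end = max(max_end, end)
    return n_overlapping
```

It can be given any iterable of lines, for example an open file handle on the shuffled BED file, and returns 0 when no region overlaps another.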
Thank you for your feedback; I would appreciate your help.
- MO -