Question: Tools: Operate On Genomic Intervals: Merge
12.1 years ago by
Erika100 wrote:
Dear Galaxy Help,

I was wondering if it would be possible to get the coordinates that caused the merge as the output from "Tools: Operate on Genomic Intervals: Merge the overlapping intervals of a query", rather than the entire merged interval. This would be similar to the output of the "Intersect: Overlapping Pieces of intervals" option, which returns the exact base pair overlap between two queries. It might be helpful in some cases to see only the coordinates that caused the merge.

From my limited Galaxy knowledge, using the "Intersect" option to compare a file to itself would also report the complete overlap of each interval with its own copy (interval_1 in file1 against interval_1 in file2). If there is already a way to get just the coordinates that caused the merge, I would be interested to learn more.

Thanks again for your help!

- Erika

**********************************************************
E.M. Kvikstad
Academic Computing Fellow
IGDP Genetics
Center for Comparative Genomics and Bioinformatics
The Pennsylvania State University
208 Mueller Lab
University Park, PA 16802
(814) 863-2185
12.1 years ago by
Ian Schenck40 wrote:
Erika,

Cluster, using a distance of 0, does the exact same thing as Merge. However, Cluster also lets you specify a minimum number of intervals per cluster (setting it to 2 ensures you only get intervals that were actually merged with another one). The maximum distance can be set to a negative number, which then forces overlap (-1 requires at least 1 bp of overlap). You can also choose the output: merge the clusters, group them (clustered intervals are listed together), or preserve the original ordering of the file. I think that is what you are trying to do.

The other possibility is that you want to capture the overlapping regions of intervals within the same file. When two intervals are merged, they might not actually have any overlap. They only need to be touching: [a,b) and [b,c) would be merged to [a,c). The "overlapping interval" there is [b,b), which doesn't really make sense, since its length is 0.

I can easily write a tool to find regions that are referenced more than once (i.e. that overlap with other intervals in the same file). However, its output will not include the case where two intervals are merged only because they are next to each other.

I hope this helps,
_Ian
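The two behaviours described above can be sketched in a few lines of Python. This is only an illustration and not Galaxy's implementation: the function names `cluster` and `covered_more_than_once` and their parameters are hypothetical, all intervals are assumed to lie on a single chromosome, and coordinates are assumed to be half-open [start, end).

```python
from typing import List, Tuple

Interval = Tuple[int, int]  # half-open [start, end), single chromosome assumed


def cluster(intervals: List[Interval], max_distance: int = 0,
            min_intervals: int = 1) -> List[Interval]:
    """Merge intervals whose gap is <= max_distance and keep only
    clusters built from at least min_intervals input intervals.
    (Hypothetical helper, not the Galaxy tool itself.)"""
    merged: List[Interval] = []
    cur_start = cur_end = None
    members = 0
    for start, end in sorted(intervals):
        # gap is 0 for touching intervals and negative for overlapping ones
        if cur_end is not None and start - cur_end <= max_distance:
            cur_end = max(cur_end, end)
            members += 1
        else:
            if cur_end is not None and members >= min_intervals:
                merged.append((cur_start, cur_end))
            cur_start, cur_end, members = start, end, 1
    if cur_end is not None and members >= min_intervals:
        merged.append((cur_start, cur_end))
    return merged


def covered_more_than_once(intervals: List[Interval]) -> List[Interval]:
    """Return the regions covered by two or more intervals of the same set."""
    events = []
    for start, end in intervals:
        events.append((start, 1))   # an interval opens
        events.append((end, -1))    # an interval closes
    # At equal positions, closes (-1) sort before opens (+1), so intervals
    # that merely touch never reach a coverage depth of 2.
    events.sort()
    regions: List[Interval] = []
    depth = 0
    region_start = None
    for pos, delta in events:
        previous = depth
        depth += delta
        if previous < 2 <= depth:
            region_start = pos
        elif depth < 2 <= previous:
            if pos > region_start:
                regions.append((region_start, pos))
            region_start = None
    return regions


intervals = [(100, 200), (150, 300), (300, 400), (500, 600)]
print(cluster(intervals, max_distance=0, min_intervals=2))
print(cluster(intervals, max_distance=-1, min_intervals=2))
print(covered_more_than_once(intervals))
```

With a distance of 0, the touching interval (300, 400) is pulled into the cluster. With a distance of -1 it is left out, and `covered_more_than_once` reports only the bases that are genuinely shared, which is the case described above where adjacency alone produces no overlapping piece.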