Question: Bedtools Error
Tamara Simakova wrote (4.6 years ago):
Hello! I'm using BEDTools on the main Galaxy server. I am trying to use the multiinter function to compare two BED files, but the resulting file contains an error: some coordinates that are present in both files are marked as present in only one file. The BED file format is correct. What could be the problem?

Thanks,
Tamara Simakova
bedtools • 1.1k views
Jennifer Hillman Jackson (United States) wrote (4.6 years ago):
Hi Tamara,

The results are correct; let me explain how to interpret the output.

When a line has multiple input file names in the 5th column ("tag"), the set of overlapping intervals between the input files represented by that line was completely contained within one of those intervals. Let's call this case a "common set". When there is just a single file in column 5, the common intervals did not have a spanning interval in the input. Let's call this case a "common item".

In the output, both are reported:

- columns 1, 2, 3: the largest interval
- column 4: the number of intervals in the set represented by this largest interval (and this line)
- column 5: each file, listed as a tag (a form option)
- last columns: how many intervals were in the set, per input file

This is your header, the result of the "BEDTools -> Intersect multiple sorted BED files" function:

    chrom start end num list IAD39777_one-based_half-opened.bed new_bad_regions.bed

_Example in your data of a "common set":_

See this line in the result dataset:

    chr7 117180358 117180364 2 IAD39777_one-based_half-opened.bed,new_bad_regions.bed 1 1

The interval above is found exactly in the input dataset "new_bad_regions.bed". In the other input, "IAD39777_one-based_half-opened.bed", you will find the other interval, which completely contains the reported interval:

    chr7 117180176 117180459

_Example in your data of a "common item":_

This is what you sent along in the attachments as highlighted lines. Because there is no interval in either input dataset that spans all overlapping intervals in the set, each is reported individually. BUT all are reported in the "Common intervals output", so this is correct. The rows are filled out according to the same rules as above, per line, as if each were a set of one, with the overarching knowledge that it is in common (overlapping) with others in the output.
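If it helps to check individual lines by hand, here is a minimal Python sketch that splits one output line into the columns listed in the header above and tests whether one interval completely contains another. It is only an illustration, not part of BEDTools or Galaxy: the names `MultiinterRow`, `parse_row` and `contains` are made up for this example, and it assumes whitespace-separated fields and file names without spaces.

```python
from typing import List, NamedTuple, Tuple


class MultiinterRow(NamedTuple):
    """One output line, using the column names from the header above."""
    chrom: str
    start: int
    end: int
    num: int
    files: List[str]      # column 5, split on commas
    per_file: List[int]   # trailing columns, one per input file


def parse_row(line: str) -> MultiinterRow:
    # Assumption: fields are separated by whitespace (tabs or spaces)
    # and file names contain no spaces.
    fields = line.split()
    chrom, start, end, num, tags = fields[:5]
    return MultiinterRow(
        chrom, int(start), int(end), int(num),
        tags.split(","), [int(x) for x in fields[5:]],
    )


Interval = Tuple[str, int, int]


def contains(outer: Interval, inner: Interval) -> bool:
    """True if `inner` lies completely within `outer` on the same chromosome."""
    return (outer[0] == inner[0]
            and outer[1] <= inner[1]
            and inner[2] <= outer[2])


# The "common set" line quoted above.
row = parse_row(
    "chr7 117180358 117180364 2 "
    "IAD39777_one-based_half-opened.bed,new_bad_regions.bed 1 1"
)
reported = (row.chrom, row.start, row.end)

# The interval quoted above from IAD39777_one-based_half-opened.bed.
other = ("chr7", 117180176, 117180459)

print(row.files)
print(contains(other, reported))  # True
```

Running the same check over the highlighted "common item" lines and the matching input intervals should show which input intervals span each reported line.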
Try a tool in the tool group "Operate on Genomic Intervals" for more options.

Hopefully this helps,
Jen
Galaxy team

--
Jennifer Hillman-Jackson
http://galaxyproject.org