Question: INTERSECT Genomic Intervals returns empty files
kylvalda wrote, 20 months ago:

Hi,

Somehow the tool Operate on Genomic Intervals - Intersect the intervals of two datasets doesn't work. Datasets that clearly have overlapping intervals (I checked in the UCSC browser) return empty intersect files (green, no error message). I generated narrowPeak files from peak calling in Galaxy and then tried to intersect them: the result was empty for 3 technical replicates from a published dataset. I then cut just the first 6 columns and converted the result to BED with defined attributes: still the same.

Can somebody please check what's going on with that tool?

I wanted to use Galaxy for a workshop on ChIP-Seq data analysis, but it seems a bit buggy :/

???

modified 19 months ago by Jennifer Hillman Jackson • written 20 months ago by kylvalda

Hi kylvalda,

Are you working at http://usegalaxy.org or can you reproduce the problem there? If so, we'd like to review the inputs and tool to find out what exactly is going on. This is how to send in a bug report: https://galaxyproject.org/issues/#usage-problem-reporting

Please include a link to this post in the comments of the bug report.

Thanks! Jen, Galaxy team

written 19 months ago by Jennifer Hillman Jackson

Yes, I work on main. Have a look at the parallel thread on GitHub: https://github.com/galaxyproject/galaxy/issues/3905#issuecomment-292720519

Have a look at my history https://usegalaxy.org/u/atvb2017/h/unnamed-history

Thanks

Sylvia

written 19 months ago by kylvalda

Perfect, reviewing now

written 19 months ago by Jennifer Hillman Jackson

Jennifer Hillman Jackson (United States) wrote, 19 months ago:

Hello,

The problem was with the ambiguous strand assignment ("."). Replace this with a forward strand assignment ("+") and the tool works as expected.

How to:

  • Use the tool Replace Text in entire line
  • Run on both input datasets
  • Find pattern: \t\.\t
  • Replace with: \t\+\t
  • Rerun the overlap tool on the modified inputs
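For anyone preparing the files outside Galaxy, the same fix can be sketched in a few lines of Python. This is only an illustration, not what the Replace Text tool does internally: the function name `assign_forward_strand` and the `strand_col` parameter are made up here, and the sketch assumes tab-separated BED or narrowPeak input with the strand in the sixth column. Unlike the plain pattern above, it edits only the strand column, so a "." in another column (for example a peak name) is left alone, and a strand in the last column is still handled.

```python
def assign_forward_strand(lines, strand_col=5):
    """Yield BED-like lines with '.' in the strand column replaced by '+'.

    strand_col is the 0-based index of the strand column (assumed to be
    the sixth column, as in BED6 and narrowPeak).
    """
    for line in lines:
        # Pass blank lines and header lines through unchanged.
        if not line.strip() or line.startswith(("#", "track", "browser")):
            yield line
            continue
        fields = line.rstrip("\r\n").split("\t")
        # Only touch the strand column, and only when it is ambiguous.
        if len(fields) > strand_col and fields[strand_col] == ".":
            fields[strand_col] = "+"
        yield "\t".join(fields) + "\n"


# Hypothetical narrowPeak-style record, shortened for illustration.
sample = ["chr1\t100\t200\tpeak1\t500\t.\t7.5\n"]
print("".join(assign_forward_strand(sample)), end="")
```

To process a whole file, pass an open file handle as `lines` and write the yielded lines to a new file, then upload that file to Galaxy.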

The UCSC browser interprets a dot "." as a "+" as part of that site's display implementation. Many Galaxy tools do the same, but this particular tool does not, and it is unlikely to be updated. So, for best results, assign a strand for any tools in this group, or others, that seem to have trouble with the ambiguous strand assignment. For ChIP-seq data, assigning the forward strand is appropriate.
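To check quickly whether a dataset carries the ambiguous strand before running an interval tool, the strand values can be tallied. The snippet below is a rough sketch under the same assumptions as the advice above (tab-separated input, strand in the sixth column); `strand_counts` is a hypothetical helper name, not part of Galaxy.

```python
from collections import Counter


def strand_counts(lines, strand_col=5):
    """Count the values found in the strand column of BED-like lines."""
    counts = Counter()
    for line in lines:
        # Skip blank lines and header lines.
        if not line.strip() or line.startswith(("#", "track", "browser")):
            continue
        fields = line.rstrip("\r\n").split("\t")
        # Records with fewer columns carry no strand at all.
        key = fields[strand_col] if len(fields) > strand_col else "missing"
        counts[key] += 1
    return counts


# Hypothetical records for illustration only.
records = [
    "chr1\t100\t200\tpeak1\t500\t.\n",
    "chr1\t300\t400\tpeak2\t450\t+\n",
]
print(strand_counts(records))
```

A non-zero count for "." suggests the dataset should have a strand assigned first.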

Thanks, Jen, Galaxy team

written 19 months ago by Jennifer Hillman Jackson