Question: Galaxy Problems
Mark Bieda wrote:
Hello All,
I'm a member of the ENCODE TR group and have done a fair amount of
programming.
Galaxy seems a well-designed approach to analysis.
But there are a number of problems, it seems, in my basic testing.
(1) First of all, I try to load a GFF file - that works fine. But
"edit attributes" won't correctly assign the columns to the data for
interval operations - when I try to do an overlap, it says something
about startcol being undefined.
(2) Generally, I strongly advise that you allow direct loading of GFF
data and recognition of this format. It's easy to do. (But this is
lower priority, I understand.)
(3) So I create my own interval files as a test - one file is a subset
of the other, with a small difference in the sizes of some intervals
(with strand information, FYI). I compute the overlap and the
difference. The overlap is correct, and the difference looks probably
OK. I then do the union of the overlap and the difference. This should
lead to my original data - but no, it doesn't.
(4) I mention the strand information because it seems that the
difference eliminates this info (bizarrely) and the overlap keeps it.
(5) As a general comment, I would say that I am quite used to
bioinformatics and programming. If I am having these sorts of
problems, this will be very hard on more experimentally oriented
biologists.
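
To make the expectation in points (1), (3) and (4) concrete, here is a
minimal Python sketch. It is not Galaxy's implementation: the function
names (gff_to_interval, merge, intersect, subtract) and the sample
coordinates are hypothetical, and it assumes 0-based, half-open interval
coordinates in the BED style. It maps GFF columns to an interval, does
base-level overlap and difference while carrying the strand of the
first dataset through, and checks that the union of overlap and
difference covers the same bases as the original.

```python
from collections import defaultdict

def gff_to_interval(line):
    """Turn one GFF line into (chrom, start, end, strand), 0-based half-open."""
    f = line.rstrip("\n").split("\t")
    # GFF columns: 1 = seqid, 4 = start, 5 = end (1-based, inclusive), 7 = strand
    return (f[0], int(f[3]) - 1, int(f[4]), f[6])

def merge(intervals):
    """Merge overlapping or adjacent intervals within each (chrom, strand)."""
    groups = defaultdict(list)
    for chrom, start, end, strand in intervals:
        groups[(chrom, strand)].append((start, end))
    merged = []
    for (chrom, strand), spans in sorted(groups.items()):
        spans.sort()
        cur_start, cur_end = spans[0]
        for start, end in spans[1:]:
            if start <= cur_end:
                cur_end = max(cur_end, end)
            else:
                merged.append((chrom, cur_start, cur_end, strand))
                cur_start, cur_end = start, end
        merged.append((chrom, cur_start, cur_end, strand))
    return merged

def _covered(b):
    """Sorted, non-overlapping (start, end) spans of b per chrom, strand ignored."""
    by_chrom = defaultdict(list)
    for chrom, start, end, _ in merge([(c, s, e, ".") for c, s, e, _ in b]):
        by_chrom[chrom].append((start, end))
    return by_chrom

def intersect(a, b):
    """Pieces of each interval in a that are covered by b; strand taken from a."""
    cov = _covered(b)
    out = []
    for chrom, start, end, strand in a:
        for b_start, b_end in cov.get(chrom, []):
            lo, hi = max(start, b_start), min(end, b_end)
            if lo < hi:
                out.append((chrom, lo, hi, strand))
    return out

def subtract(a, b):
    """Pieces of each interval in a that are not covered by b; strand taken from a."""
    cov = _covered(b)
    out = []
    for chrom, start, end, strand in a:
        pos = start
        for b_start, b_end in cov.get(chrom, []):
            if b_end <= pos or b_start >= end:
                continue  # this span of b does not touch the remaining piece
            if b_start > pos:
                out.append((chrom, pos, b_start, strand))
            pos = max(pos, b_end)
        if pos < end:
            out.append((chrom, pos, end, strand))
    return out

# Made-up test data: `subset` shrinks two of the intervals in `original`.
original = [("chr1", 100, 200, "+"), ("chr1", 300, 400, "-"), ("chr2", 50, 80, "+")]
subset = [("chr1", 120, 200, "+"), ("chr1", 300, 380, "-")]

overlap = intersect(original, subset)
difference = subtract(original, subset)

# The union of overlap and difference should give back the original data,
# with the strand column intact in both intermediate results.
assert merge(overlap + difference) == merge(original)
```

Under these assumptions the round trip holds and the strand survives
both operations, which is the behaviour I expected from the tools.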
Ok, so I'm writing this because, like I said, Galaxy looks like a very
well-thought-out approach to doing this stuff - I'm impressed with the
overall project approach - but it doesn't seem to be working very well
right now - or it is over my head.
Say hello to Ross Hardison for me, and I look forward to hearing from
you.
Also, I've attached files for your testing.
Mark
Mark Bieda, Ph.D.
UC-Davis Genome Center Postdoctoral Fellow
Farnham Lab
Written 12.6 years ago by Mark Bieda; modified 12.1 years ago by Nate Coraor