Question: Problem With Gmaj
Ross Hardison (United States) wrote, 12.3 years ago:
I'm trying to use the GMAJ tool on Galaxy, but I get this error:

    Error loading alignments from bundled file "input.maf":
    edu.psu.bx.gmaj.BadInputExceptions: Sequence length contradiction
    s baboon.1 26891 63 + 1248010 TCAG...
    Current = 1248010. Previous = 550206

The MAF files were extracted using the "Extract MAF blocks" tool, for the TBA ENCODE alignments. The MAF data are at http://main.g2.bx.psu.edu/display?id=3054
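The "Current" and "Previous" values in the error are two different source lengths reported for the same sequence name. A rough way to confirm that in a MAF file is to collect the source size from every `s` line and list the names that appear with more than one value. This is only a sketch: the function name `find_length_conflicts` is made up for illustration, and it assumes the standard whitespace-separated `s` line layout (src, start, size, strand, srcSize, text). It is not how GMAJ itself performs the check.

```python
from collections import defaultdict


def find_length_conflicts(maf_lines):
    """Return {source name: sorted source sizes} for names seen with more than one size.

    Hypothetical helper for illustration; assumes MAF "s" lines of the form
    s src start size strand srcSize text
    """
    sizes = defaultdict(set)
    for line in maf_lines:
        fields = line.split()
        # Only sequence lines carry a source name and source size.
        if len(fields) >= 7 and fields[0] == "s":
            sizes[fields[1]].add(int(fields[5]))
    # Keep only the names that were reported with conflicting sizes.
    return {src: sorted(found) for src, found in sizes.items() if len(found) > 1}
```

Passing an open file handle for the extracted MAF (for example `find_length_conflicts(open("input.maf"))`) should return an entry for every name that would trigger this kind of error.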
James Taylor (United States) wrote, 12.3 years ago:
Ross,

The problem here is that there are multiple sequences with the name 'baboon.1'. In fact, in the ENCODE TBA alignments there is a different 'baboon.1' for every ENCODE region, so when you concatenate MAFs across multiple regions you get a result that GMAJ (rightly) cannot understand.

To correct this, I think the only thing we can do is change the MAFs provided by the ENCODE group so that every orthologous sequence has a unique name across ENCODE regions. So baboon.1 -> baboon.ENm001_1, and so on.

All, would such a change break any existing tools or analyses?

-- jt
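For anyone who wants to try the renaming on their own extracted files before concatenating them, here is a minimal sketch of the baboon.1 -> baboon.ENm001_1 idea. The function name `rename_sources` is hypothetical, and the sketch assumes each input file covers a single ENCODE region whose name you supply yourself. It is not the procedure used to produce the ENCODE MAFs.

```python
def rename_sources(maf_lines, region):
    """Yield MAF lines with "s" source names changed from species.N to species.<region>_N.

    Hypothetical helper for illustration; `region` (e.g. "ENm001") is supplied
    by the caller and is assumed to apply to every block in the input.
    """
    for line in maf_lines:
        fields = line.split()
        if len(fields) >= 7 and fields[0] == "s" and "." in fields[1]:
            species, _, seq = fields[1].partition(".")
            new_name = f"{species}.{region}_{seq}"
            # The source name is the first token after the leading "s",
            # so replacing its first occurrence leaves the rest untouched.
            line = line.replace(fields[1], new_name, 1)
        yield line
```

Non-sequence lines (the `a` lines, comments, blank lines) pass through unchanged. The longer names will shift the column padding on `s` lines, which MAF readers that split on whitespace should not mind, though that is an assumption worth checking for your own tools.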
No conflicts on my end. Actually, it would be a rather convenient (and accurate) annotation to refer to the aligned species by region. -David
Reply written 12.3 years ago by D C King
This sounds like the correct solution, and I cannot think of any tools which would be broken by doing this. Dan
Reply written 12.3 years ago by Dan Blankenberg