Question: One Question About The Genome Coordinates When Using Fetch Sequences
7.7 years ago, Sean wrote:
Hi, I have one stupid question. The coordinates of the region chr1 2351533 - 2351843 from UCSC (hg18) retrieve 311 bases. However, when I use Fetch Sequences in Galaxy, it retrieves only 310 bases. Apparently the first base of the 311 is missing from the Fetch Sequences result, because the ending bases are the same. Does this mean that I need to modify the coordinates first and then use Fetch Sequences to get the correct sequence? I thought UCSC and Galaxy were both 0-based? Thanks. Sean
galaxy • 739 views
7.7 years ago, Jennifer Hillman Jackson (United States) wrote:
Hello,

The coordinates are interpreted in Galaxy as having a 0-based start. This means that the actual genome position of the first base is the start value plus 1.

Not a stupid question - everyone has to learn this as they begin to work with data types sourced originally from UCSC and associated projects.

Depending on which tool you are using in the UCSC database, the coordinates will be interpreted as 0-based or 1-based. What tools outside of UCSC or Galaxy do with the coordinates can vary. In general:

- positional coordinates of format "chrA:NNN-NNNN" will be 1-based
- BED/Interval format will be 0-based

More help is in the "Convert formats" tool descriptions (included in the BED format description). And, this link at UCSC has all of the details:

Hopefully this helps!

Best,
Jen
Galaxy team

-- Jennifer Jackson
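To make the arithmetic from the question concrete, here is a minimal Python sketch. It assumes that Fetch Sequences reads its input as BED-style intervals (0-based start, end not included), which is what the answer above describes, and that the UCSC position is 1-based with both ends included. The function names are hypothetical and are not part of Galaxy or UCSC.

```python
def region_length(start, end):
    """Length of a 1-based region where both start and end are included."""
    return end - start + 1


def bed_length(start, end):
    """Length of a 0-based interval where the end is not included (BED style)."""
    return end - start


def region_to_bed(start, end):
    """Convert 1-based inclusive coordinates to a 0-based BED-style interval."""
    return start - 1, end


def bed_to_region(start, end):
    """Convert a 0-based BED-style interval to 1-based inclusive coordinates."""
    return start + 1, end


# Coordinates from the question (chr1, hg18).
ucsc_start, ucsc_end = 2351533, 2351843

# Read as a 1-based position, the region spans 311 bases.
print(region_length(ucsc_start, ucsc_end))  # 311

# The same two numbers read as a BED interval span only 310 bases:
# the base at 1-based position 2351533 is left out.
print(bed_length(ucsc_start, ucsc_end))  # 310

# Subtracting 1 from the start gives the BED interval covering all 311 bases.
print(region_to_bed(ucsc_start, ucsc_end))  # (2351532, 2351843)
```

Under these assumptions, entering a start of 2351532 and an end of 2351843 as the interval would return the same 311 bases as the UCSC position. Only the start shifts; the end value is the same in both conventions.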