Using the tool "Get Data -> UCSC Main table browser", data can be
retrieved directly using either gene symbols or locus positions.
A good track to query is "UCSC Genes", if available for your
genome. "RefSeq Genes" is another good choice. But really, any track in
the group "Gene and Gene Prediction Tracks" is worth a look to see if
it fits what you are interested in, as the content can vary between
genomes and even builds. The specifics can be reviewed at UCSC by
clicking "describe table schema" (the button next to the "table"
selection; start with the default table).
To search multiple gene symbols, enter the list in the form under
"identifiers". To search multiple loci, enter the list under "region"
(define regions). Both accept a text file, so cut the relevant
information out of the original file, format it the way the UCSC form
states, and download it from Galaxy as text (tabular). Or, export as
text from the spreadsheet. 300 entries should be fine at once; I
believe the limit is 1000 per query for each of these.
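If it helps, here is a minimal Python sketch of preparing those lists outside of the spreadsheet. It is only an illustration: the function names are made up for this example, the batch size of 1000 is just the limit I recalled above, and the "chrom:start-end" position style is an assumption, so check the UCSC form for the formats it actually states.

```python
from typing import Iterable, List


def batch(items: List[str], size: int = 1000) -> List[List[str]]:
    """Split items into consecutive batches of at most `size` entries.

    The default of 1000 is an assumed per-query limit, not a confirmed one.
    """
    if size < 1:
        raise ValueError("size must be at least 1")
    return [items[i:i + size] for i in range(0, len(items), size)]


def format_region(chrom: str, start: int, end: int) -> str:
    """Format one locus as a position string such as chr1:1000-2000.

    The position style is an assumption; use whatever the form asks for.
    """
    return f"{chrom}:{start}-{end}"


def write_lines(path: str, lines: Iterable[str]) -> None:
    """Write one entry per line as plain text with Unix newlines."""
    with open(path, "w", newline="\n") as handle:
        for line in lines:
            handle.write(line + "\n")


if __name__ == "__main__":
    # Hypothetical loci; replace with the rows cut from your own file.
    loci = [("chr1", 1000, 2000), ("chrX", 500, 900)]
    regions = [format_region(*locus) for locus in loci]
    for number, chunk in enumerate(batch(regions), start=1):
        write_lines(f"regions_{number}.txt", chunk)
```

Each output file can then be uploaded on its own, one query per file.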
At this point in the query, the extract would just pull basic data
from the single primary table. To also pull out related information,
change the "output format" to "selected fields from primary and related
tables" and then click on "get output".
The next form is where you can link in additional tables of data. The
general idea is to add the table, then select the specific fields that
you want to include. Again, any of these can be reviewed before the
final query is made, either from the first main form using the
"describe table schema" button or, once in that describe view, by
clicking on related tables to navigate. When doing the query this way,
the Table browser takes care of the relational joins for you, just as
an SQL query would.
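To make the idea of a relational join concrete, here is a small self-contained Python sketch using an in-memory SQLite database. The table and column names are modeled loosely on the knownGene and kgXref tables and should be treated as assumptions, and the rows are invented toy data, so this shows the kind of join involved rather than what UCSC actually runs.

```python
import sqlite3
from typing import List, Sequence, Tuple


def build_demo_db() -> sqlite3.Connection:
    """Create an in-memory database with two toy tables (invented rows)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE knownGene (name TEXT, chrom TEXT, txStart INTEGER, txEnd INTEGER);
        CREATE TABLE kgXref (kgID TEXT, geneSymbol TEXT);
        INSERT INTO knownGene VALUES ('tx001', 'chr1', 1000, 5000);
        INSERT INTO knownGene VALUES ('tx002', 'chrX', 200, 900);
        INSERT INTO kgXref VALUES ('tx001', 'GENEA');
        INSERT INTO kgXref VALUES ('tx002', 'GENEB');
        """
    )
    return conn


def lookup(conn: sqlite3.Connection, symbols: Sequence[str]) -> List[Tuple]:
    """Return symbol, transcript and position for each requested symbol."""
    if not symbols:
        return []
    placeholders = ",".join("?" for _ in symbols)
    sql = (
        "SELECT x.geneSymbol, g.name, g.chrom, g.txStart, g.txEnd "
        "FROM knownGene AS g "
        "JOIN kgXref AS x ON x.kgID = g.name "  # link primary and related table
        f"WHERE x.geneSymbol IN ({placeholders}) "
        "ORDER BY x.geneSymbol"
    )
    return conn.execute(sql, list(symbols)).fetchall()


if __name__ == "__main__":
    for row in lookup(build_demo_db(), ["GENEA", "GENEB"]):
        print(row)
```

Selecting fields from the primary and a related table in the web form corresponds roughly to the SELECT list here, with the join condition handled for you.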
For more help about using the UCSC table browser, these links are good
places to start, and for detailed questions about a specific piece of
data that you cannot locate, the support team for the browser can
certainly help. The Table browser is not your only option (flat text
files and a MySQL database are also available), but it is a web-based
access point to the information that is easily imported into Galaxy or
downloaded for further analysis. Other types of queries are also
possible, at UCSC and in Galaxy; this is just the most direct one I
know for your question and original data:
One note: you have the locus position with a chromosome identifier in
the format "Chr1" in your email. I am not sure if this was intentional
or not - but you will need to format the identifiers to match those in
the target reference genome, just as they were in the original
In general, this would mean the format would be "chrX" instead (case
matters). So, check and adjust the case and format to avoid problems;
these really do have to be an exact match. The same is true for gene
names/symbols: you can always search in the browser to see what the
format is if something is missing, and adjust. Also make sure that the
spreadsheet does not output any hidden characters (line wraps). Stick
with plain text cells for best results if you plan to output/use the
data with external tools. You probably know most of this, but just in
case, I wanted to point out where the gotchas could be. Even if using
gene symbols for this, you may want to use the position later on, and
identifiers in the correct format from the start are a good idea.
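As a rough illustration of that cleanup, here is a minimal Python sketch that strips hidden characters and fixes the case of simple chromosome names. The function names are hypothetical, the list of hidden characters is only a guess at the common ones, and the sketch assumes the target genome uses names like "chr1" and "chrX"; anything it does not recognize is returned unchanged so you can check it by hand.

```python
import re

# Characters that spreadsheets commonly leave behind (an assumed, partial list):
# carriage return, newline, tab, non-breaking space, zero-width space, BOM.
_HIDDEN = re.compile(r"[\r\n\t\u00a0\u200b\ufeff]")

# Simple names only: "chr" in any case followed by digits or X, Y, M.
_SIMPLE_CHROM = re.compile(r"^chr([0-9]+|[XYM])$", re.IGNORECASE)


def clean_cell(raw: str) -> str:
    """Remove hidden characters and surrounding spaces from one cell."""
    return _HIDDEN.sub("", raw).strip()


def normalize_chrom(raw: str) -> str:
    """Return a simple chromosome name as lowercase 'chr' plus an uppercase suffix.

    Names that do not match the simple pattern are returned cleaned but
    otherwise unchanged.
    """
    cleaned = clean_cell(raw)
    match = _SIMPLE_CHROM.match(cleaned)
    if match is None:
        return cleaned
    return "chr" + match.group(1).upper()


if __name__ == "__main__":
    for value in ["Chr1", " CHRx\r\n", "chrUn_gl000220"]:
        print(repr(value), "->", normalize_chrom(value))
```

Running the same clean_cell step over gene symbols before pasting them into the form avoids most of the silent mismatches.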
Hopefully this gets you started!