This is a great idea that the team has been considering, but nothing
immediate is planned. Some external teams are working on outside
development, and this is on their list, too. If you are interested in
what that project is doing, please see this thread:
For now, if the data resides in a track at UCSC (much of it does,
especially for vertebrate genomes, and it is updated daily), the Table
Browser can export the data in GTF and push it to Galaxy with the "Get
Data" tool. Since some of the data can be large, using BX Main (our
local UCSC mirror) may be the best source.
To do this, navigate to the target genome and track (RefSeq is under
Gene Predictions, others are under mRNA and EST), and choose the output
format "GTF - gene transfer format". Please note that the "gene_id"
attribute in the 9th field will not be populated with the gene name (it
will be the same as transcript_id). This is just how UCSC does it right
now (full GTF output in the TB is on their list, as far as we know).
To get that info now, go back in and re-export the same table data as
"all fields from selected table" into Galaxy; the gene name will be in
the field named "name2". The text manipulation tools can help format
the data.
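
If you would rather do that last formatting step in a script than with
the text manipulation tools, here is a minimal Python sketch of the idea.
It is only an illustration, not something Galaxy or UCSC provides. It
assumes the "all fields" export is tab-separated with a header line that
starts with "#", and that the transcript accession is in the column named
"name" (with the gene name in "name2", as described above). The function
names are made up for this example. Transcripts that are not found in the
table are left unchanged.

```python
import re

# Patterns for the two attributes of interest in the 9th GTF field.
TRANSCRIPT_RE = re.compile(r'transcript_id "([^"]+)"')
GENE_RE = re.compile(r'gene_id "[^"]*"')


def load_gene_names(table_text):
    """Build {transcript accession: gene name} from the "all fields" export.

    Assumes a tab-separated table whose first line is a header such as
    "#bin<TAB>name<TAB>...<TAB>name2".
    """
    lines = table_text.splitlines()
    header = lines[0].lstrip("#").split("\t")
    name_col = header.index("name")    # transcript accession (assumed column)
    name2_col = header.index("name2")  # gene name
    mapping = {}
    for line in lines[1:]:
        if not line.strip():
            continue
        fields = line.split("\t")
        mapping[fields[name_col]] = fields[name2_col]
    return mapping


def fix_gene_ids(gtf_text, mapping):
    """Return the GTF text with gene_id replaced by the gene name where known."""
    out = []
    for line in gtf_text.splitlines():
        fields = line.split("\t")
        # Pass comments and malformed lines through untouched.
        if line.startswith("#") or len(fields) < 9:
            out.append(line)
            continue
        match = TRANSCRIPT_RE.search(fields[8])
        if match and match.group(1) in mapping:
            gene = mapping[match.group(1)]
            # A function is used as the replacement so the gene name is
            # inserted literally, with no backslash interpretation.
            fields[8] = GENE_RE.sub(
                lambda _m: 'gene_id "%s"' % gene, fields[8], count=1
            )
        out.append("\t".join(fields))
    return "\n".join(out)
```

To use it, read both downloaded files into strings, call
load_gene_names() on the table export, then fix_gene_ids() on the GTF
export, and write the result to a new file.
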
A workflow would be a good option once you have the tool path worked
out, so that it can be reused for similar GenBank datasets in the
future without redoing every step. You may even want to publish the
workflow for others to use, as this is a very popular request, and
perhaps add a published page explaining how to prepare the input data.
Apologies for the current inconvenience, but hopefully this can get
you going until a more direct method is implemented in Galaxy. This is
a great idea that many other users are also very interested in. Any
contributions (page, workflow) would be most welcome. A tool that
performs the extraction directly from GenBank would also be welcome in
the Tool Shed, if you want to contribute.