I think that you are asking about the locally cached reference genome
builds - also sometimes called 'built-in'? If so, these wikis have
instructions to help you create, organize, and obtain these data (if you
want copies of what we host on Main).
Reference genome builds can seem to have several "names" given to them
but if you look closely, you will note that each contains the most
important identifier, the "dbkey". This short "dbkey" tag is the value
assigned to the "database" metadata attribute in the UI.
Making sure that this is consistent, and that the location files (e.g.
".loc" files) contain the correct key and are in the correct format, will
move you in the right direction. Watch for: tabs (not spaces) between
fields, no extra white space, no extra lines, dbkey values that are the
same as those used in the reference and index file names, etc. - as
described in the headers of the .loc files and the wikis below. You can
also follow what we have in ours from the rsync server.
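The .loc checks described above can be sketched as a small script. This is only a rough illustration, not something shipped with Galaxy: the function name check_loc_lines is hypothetical, and the four-column layout (unique id, dbkey, display name, path) is an assumption, since the real column layout differs per tool and is documented in the header of each .loc file. Adjust the column settings to match the file you are checking.

```python
def check_loc_lines(lines, expected_fields=4, dbkey_col=1, path_col=3):
    """Return (line_number, message) pairs for problems found in .loc content.

    Assumed layout (hypothetical, check the header of your own .loc file):
    unique_id <TAB> dbkey <TAB> display_name <TAB> path
    """
    problems = []
    for number, raw in enumerate(lines, start=1):
        line = raw.rstrip("\n")
        if line.startswith("#"):
            continue  # header comments describe the format; skip them
        if not line.strip():
            problems.append((number, "blank or whitespace-only line"))
            continue
        fields = line.split("\t")
        if len(fields) != expected_fields:
            problems.append(
                (number, f"expected {expected_fields} tab-separated fields, "
                         f"found {len(fields)}")
            )
            continue
        if any(field != field.strip() for field in fields):
            problems.append((number, "extra white space around a field"))
        # Compare stripped values so a white space problem is reported once.
        dbkey = fields[dbkey_col].strip()
        path = fields[path_col].strip()
        if dbkey not in path:
            problems.append(
                (number, f"dbkey '{dbkey}' does not appear in path '{path}'")
            )
    return problems


if __name__ == "__main__":
    import sys

    # Usage: python check_loc.py path/to/some_tool.loc
    with open(sys.argv[1]) as handle:
        for number, message in check_loc_lines(handle):
            print(f"line {number}: {message}")
```

An empty result only means the file passed these basic checks. It does not confirm that the index files themselves exist or were built correctly.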
- The most important parts of this wiki to note are the
file (this is the list of genomes in the pull down menu) and then the
rsync server instructions (this is where you can obtain copies of the
data used on Main)
- Organizing data and building indexes for various tools.
Adding "Custom Reference Genomes" through the UI - as done by
users - is a different process, but I don't think you are asking about
that. The internals for that are not in a wiki, but the processing is
automatic and much is done on the fly as tools execute, when a fasta
file from the history is selected as the target reference build. (Same
indexes, just not saved beyond the genome being loaded as when added
using: User -> Custom Builds, good for using Trackster).
Hopefully this will help you get started.