Question: S. cerevisiae reference genome from UCSC--2bit extraction issue
3.5 years ago by
United States
sv112 wrote:

I downloaded the sacCer3 (S. cerevisiae) reference genome as a 2bit file from UCSC and extracted the sequence with the twoBitToFa utility. However, I only see the sequence of chromosome I when I open the output file. What is the specific command to convert all of the data in the 2bit file into FASTA, and not just part of it?




3.5 years ago by
United States
Jennifer Hillman Jackson wrote:


By default, this utility extracts every sequence in the .2bit file to FASTA. If you are having problems with it or with a particular .2bit file from the UCSC downloads server, I suggest contacting the UCSC Genome Browser team.

The command usage can be obtained by running the command without arguments or by reviewing the tool list on the UCSC downloads server under the "Source" section. I am also including it below for reference.

Take care, Jen, Galaxy team


$ twoBitToFa
twoBitToFa - Convert all or part of .2bit file to fasta
   twoBitToFa input.2bit output.fa
   -seq=name - restrict this to just one sequence
   -start=X  - start at given position in sequence (zero-based)
   -end=X - end at given position in sequence (non-inclusive)
   -seqList=file - file containing list of the desired sequence names 
                    in the format seqSpec[:start-end], e.g. chr1 or chr1:0-189
                    where coordinates are half-open zero-based, i.e. [start,end)
   -noMask - convert sequence to all upper case

Sequence and range may also be specified as part of the input
file name using the syntax:

Update: As a test, I just rsync'd the sacCer3.2bit file from UCSC and then ran twoBitToFa. All of the content was as expected. My commands were:

$ rsync -avzP rsync:// .
$ twoBitToFa sacCer3.2bit sacCer3.fa
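
One quick way to confirm that the output holds more than chromosome I is to count and list the FASTA header lines rather than scrolling through the file in a viewer. The sketch below is only a suggestion: the helper names count_fasta_records and list_fasta_names are made up for illustration, and it assumes the output file uses standard FASTA headers that begin with '>'.

```shell
# Hypothetical helpers (not part of the UCSC tools) for inspecting a FASTA file.

# Count the sequences in a FASTA file: one header line ('>') per sequence.
count_fasta_records() {
    grep -c '^>' "$1"
}

# Print the name of every sequence, one per line, with the leading '>' removed.
list_fasta_names() {
    grep '^>' "$1" | sed 's/^>//'
}
```

After sourcing these functions, running count_fasta_records sacCer3.fa should report one record per sequence in the 2bit file (to my knowledge 17 for sacCer3: chrI through chrXVI plus chrM), and list_fasta_names sacCer3.fa shows their names. If only chrI appears, the extraction was restricted or incomplete; if all of them appear, the file is fine and the viewer was only showing its beginning.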

3.5 years ago by
United States
sv112 wrote:

That worked perfectly, thanks!



