Question: How to combine fasta datasets within Galaxy
4 months ago by
chalagarciad0 wrote:


I am trying to make a combined genome: the human genome (hg19) plus the EBV genome. I have been unable to do it by opening the files and copying the two together because of the large size of the human genome. Does anyone know of a Galaxy tool that could help me put these genomes together, or of any other way to do it outside of Galaxy?

I greatly appreciate any help!

4 months ago by
United States
Jennifer Hillman Jackson22k wrote:


This is how to combine two or more fasta datasets. In your case, there are two datasets where each represents a genome.

  1. Upload the datasets to Galaxy in fasta format.
    • Use FTP upload if the data is over 2 GB (it probably will be).
  2. Run the tool Normalize Fasta on each. This standardizes the format.
    • Use the options to wrap at 80 bases and to trim the title line at the first whitespace.
    • Make sure that each identifier (the first "word" on the ">" line) is unique across all datasets that you wish to combine.
  3. Combine the two (or more) normalized fasta datasets into one with the tool Concatenate.
    • The datasets are "stacked" into a single dataset.
    • This tool is not just for fasta format. It can combine any number of plain text datasets of the same datatype, as long as they have no header or comment lines (or these are removed first).
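
If you would rather do this outside of Galaxy, steps 2 and 3 can be approximated with a short Python script. This is only a minimal sketch of the same idea, not the code behind the Normalize Fasta or Concatenate tools. The function names and the file names mentioned afterward are made up for illustration, and it assumes plain (uncompressed) fasta text.

```python
def normalize_fasta(lines, width=80):
    """Yield fasta lines with titles trimmed at the first whitespace
    and sequence re-wrapped to `width` bases per line."""
    buffer = ""  # sequence characters not yet written out
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            if buffer:  # finish the previous record's last (short) line
                yield buffer
                buffer = ""
            fields = line[1:].split()
            if not fields:
                raise ValueError("fasta title line has no identifier")
            yield ">" + fields[0]  # keep only the first "word"
        else:
            buffer += line
            while len(buffer) >= width:
                yield buffer[:width]
                buffer = buffer[width:]
    if buffer:
        yield buffer


def combine_fasta(sources, out, width=80):
    """Normalize each source (an iterable of lines) and stack them into `out`.
    Raises ValueError if the same identifier appears twice."""
    seen = set()
    for src in sources:
        for line in normalize_fasta(src, width):
            if line.startswith(">"):
                ident = line[1:]
                if ident in seen:
                    raise ValueError("duplicate identifier: " + ident)
                seen.add(ident)
            out.write(line + "\n")


def combine_fasta_files(paths, out_path, width=80):
    """Convenience wrapper: same as combine_fasta, but takes file paths."""
    handles = [open(p) for p in paths]
    try:
        with open(out_path, "w") as out:
            combine_fasta(handles, out, width)
    finally:
        for handle in handles:
            handle.close()
```

For example, `combine_fasta_files(["hg19.fa", "ebv.fa"], "hg19_ebv.fa")` would write the combined genome to a new file (those file names are placeholders). The script reads line by line, so it never needs to hold the whole human genome in memory, and it stops with an error on a duplicate identifier instead of silently writing a file that downstream tools may reject.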

This will effectively create a fasta dataset that can be used as a Custom Reference Genome and optionally a Custom Build. If you have trouble with this or want more details, please start by reviewing the guide here, then let us know if anything is unclear:

Hope this helps! Jen, Galaxy team
