Hello,
Not in one step for the different input types, that I am aware of.
Some options:
1. The mapping tools for NGS data are designed to map NGS reads against a single reference fasta file (typically a genome, transcriptome, or exome). Each reference fasta file, aka dataset, could be used as a Custom genome if it is not already indexed on the Galaxy server in use.
2. This isn't exactly what you described as the goal, but the two reference sequences could be combined into a single "meta" reference dataset and the NGS reads aligned against that as a custom reference genome.
Note that neither 1 nor 2 would provide homology information between the two reference sequences themselves, just the NGS reads versus each, distinctly. Those two sequences could still be compared to each other using different tools; which tools would depend on what kind of data they represent.
3. The NGS reads could be assembled to produce a consensus fasta dataset (in effect a "third" reference dataset) and then all three compared to produce a MAF result (Multiple Alignment Format). If you want to do that, the content of the reference sequences should be considered when selecting a tool. If you identify a tool that meets your needs and matches the data input types (from a publication, web search, etc.), check to see if it is wrapped for Galaxy in the Tool Shed (http://usegalaxy.org/toolshed). Considerations when selecting a tool (there can be others):
- How successful was the NGS read assembly?
- Is each "reference sequence" really just a single sequence?
- Or are there multiple sequences per distinct "reference dataset"?
- What do they represent (transcript(s), chromosome(s), exon(s), other)?
- How long are the reference sequences?
- Two individual transcripts are relatively easy to compare.
- Two genomes, even smaller ones, or many-versus-many transcripts are more complicated to compare.
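For the "meta" reference idea and the questions about how many sequences each reference holds and how long they are, here is a minimal Python sketch that can be run outside Galaxy before uploading. It is not a Galaxy tool; the function names, the label prefixes, and the file names in the usage note are all hypothetical. It assumes plain, uncompressed fasta input, and it keeps only the identifier from each header line (dropping any description text), on the assumption that short, unique names are safer for a custom genome.

```python
def read_fasta(path):
    """Yield (header, sequence) pairs from a plain-text fasta file."""
    header, chunks = None, []
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue
            if line.startswith(">"):
                if header is not None:
                    yield header, "".join(chunks)
                header, chunks = line[1:], []
            else:
                chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def summarize(path):
    """Return (number of sequences, list of sequence lengths) for one file."""
    lengths = [len(seq) for _, seq in read_fasta(path)]
    return len(lengths), lengths


def combine_references(inputs, output, width=60):
    """Write every sequence from the (label, path) pairs into one fasta file.

    Each identifier is prefixed with its label (a hypothetical naming
    scheme) so that names stay unique in the combined reference.
    """
    seen = set()
    with open(output, "w") as out:
        for label, path in inputs:
            for header, seq in read_fasta(path):
                fields = header.split()
                if not fields:
                    raise ValueError(f"empty fasta header in {path}")
                name = f"{label}_{fields[0]}"
                if name in seen:
                    raise ValueError(f"duplicate sequence name: {name}")
                seen.add(name)
                out.write(f">{name}\n")
                # Wrap sequence lines at a fixed width.
                for i in range(0, len(seq), width):
                    out.write(seq[i:i + width] + "\n")
```

With two hypothetical files, `summarize("refA.fa")` reports the sequence count and lengths, and `combine_references([("refA", "refA.fa"), ("refB", "refB.fa")], "meta.fa")` writes the combined file, which could then be uploaded and used as a custom reference genome.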
Custom genome help:
I hope that this helps! Jen, Galaxy team