How do I add a genome to Database/Build?

21 days ago by

amelidou • 0 wrote:

Hi! After generating a pileup file and converting the pileup to interval format, we could potentially extract the consensus using the Extract Genomic DNA option. But it asks you to first tag the sequence using one of the existing Database/Build options, which are quite limited. How can you add more? Is there another way to extract the consensus from a pileup file? Thank you! Angeliki

database galaxy pileup samtools

20 days ago by

Jennifer Hillman Jackson ♦ 25k wrote:

Hello,

The 10 column pileup format includes the consensus in the 9th field. You could use a tool like "Tabular-to-Fasta" to extract it directly.

It is also possible to take the positional coordinates, apply them to the custom reference genome you originally mapped against, and pull out that sequence, but be aware that the sequence won't include any of the variation from your read data. If that is your goal (maybe you want to have both), the Extract tool can be used with the option to get the reference genome fasta data from the history. There is no need to assign the database for that usage.

Tool: Extract Genomic DNA
Option: Choose the source for the reference genome
- Setting: From the history
Option: Using reference genome
- Setting: Pick your custom genome fasta dataset
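
For anyone who wants to see what those settings amount to outside of Galaxy, here is a minimal Python sketch. It is not the Extract Genomic DNA tool's code. It assumes the intervals are 0-based and half-open (start included, end excluded), and the helper names `read_fasta` and `extract_interval` and the demo sequence `chrA` are made up for illustration.

```python
import io


def read_fasta(handle):
    """Parse FASTA text into a dict of {sequence_name: sequence}."""
    sequences = {}
    name = None
    chunks = []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Store the previous record before starting a new one.
            if name is not None:
                sequences[name] = "".join(chunks)
            # Keep only the first word of the header as the name.
            name = line[1:].split()[0]
            chunks = []
        else:
            chunks.append(line)
    if name is not None:
        sequences[name] = "".join(chunks)
    return sequences


def extract_interval(sequences, chrom, start, end):
    """Return reference bases for a 0-based, half-open interval (assumed convention)."""
    return sequences[chrom][start:end]


# Hypothetical custom genome with a single short sequence.
reference = read_fasta(io.StringIO(">chrA demo\nACGTAC\nGTTT\n"))
region = extract_interval(reference, "chrA", 2, 7)
print(region)
```

As noted above, the bases returned this way come straight from the reference, so none of the variation seen in the reads shows up in them.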

If you ever do need a custom genome's database assigned to a dataset (some tools do work better that way -- but not Extract), a custom genome can be promoted to a custom build. This creates a "database" specifically associated with your account that can be assigned the same way as any other database (Pencil icon > Edit Attributes > first tab, genome/database selection > Save).

FAQ: https://galaxyproject.org/support/

Preparing and using a Custom Reference Genome or Build https://galaxyproject.org/learn/custom-genomes/

Thanks! Jen, Galaxy team

20 days ago by

amelidou • 0 wrote:

Hi Jen, thank you for your answer! My column 9 contains various characters (like ".", ",", "g", "$"), and if you try Tabular-to-FASTA it only creates different fasta files (not one consensus) with all these characters under each heading... Column 4 looks more like the sequence, but it still generates various fasta files with different headings (not just one consensus). Any ideas? Best regards, Angeliki

If you do not want the encoded consensus, try NGS: SAMtools > Pileup-to-Interval.

The consensus sequence from a pileup dataset is not the same as an assembly. These are short regions where variation was detected and reported.
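
To illustrate why this comes out as several short sequences rather than one, here is a rough Python sketch that joins consecutive pileup positions into runs and writes each run as its own FASTA record. It assumes tab-separated rows with the sequence name in column 1, a 1-based position in column 2, and the consensus call in column 4, as the comment above describes for this dataset, so check the column numbers against your own file. Rows whose consensus field is not a single character (for example indel rows) are skipped as a simplification. The function names and demo rows are made up for illustration.

```python
def consensus_runs(pileup_lines, chrom_col=0, pos_col=1, base_col=3):
    """Group consecutive pileup positions into (chrom, start, sequence) runs."""
    runs = []
    current = None  # [chrom, start, last_pos, list_of_bases]
    for line in pileup_lines:
        if not line.strip():
            continue
        fields = line.rstrip("\n").split("\t")
        chrom = fields[chrom_col]
        pos = int(fields[pos_col])
        base = fields[base_col]
        if len(base) != 1:
            # Simplification (assumption): ignore rows without a single-base call.
            continue
        if current and current[0] == chrom and pos == current[2] + 1:
            # Position directly follows the previous one: extend the run.
            current[2] = pos
            current[3].append(base)
        else:
            # Gap or new sequence name: close the old run and start a new one.
            if current:
                runs.append((current[0], current[1], "".join(current[3])))
            current = [chrom, pos, pos, [base]]
    if current:
        runs.append((current[0], current[1], "".join(current[3])))
    return runs


def to_fasta(runs):
    """Write one FASTA record per run, named chrom:start-end (1-based, inclusive)."""
    return "".join(
        f">{chrom}:{start}-{start + len(seq) - 1}\n{seq}\n"
        for chrom, start, seq in runs
    )


# Made-up 10-column rows; only columns 1, 2 and 4 are used here.
demo = [
    "chr1\t100\tA\tA\t30\t0\t60\t3\t.,.\tIII",
    "chr1\t101\tC\tT\t30\t30\t60\t3\ttTt\tIII",
    "chr1\t102\tG\tG\t30\t0\t60\t3\t.,.\tIII",
    "chr1\t250\tT\tT\t30\t0\t60\t2\t.$,\tII",
]
runs = consensus_runs(demo)
fasta = to_fasta(runs)
print(fasta)
```

Each gap in coverage starts a new record, which is why the result is a set of short regions and not an assembly.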

written 10 days ago by Jennifer Hillman Jackson ♦ 25k
