I have aligned FASTQ reads (from an insect vector) to a small bacterial genome of 1.2 Mb (as a text file) using Bowtie2 in Galaxy. The aligned reads fall into about 15 contigs separated by short gaps. How do I extract the consensus sequences as FASTA contigs from a BAM or SAM file?
Hello,
The mapped BAM dataset contains only the original reads, not assembled results based on overlap or other factors. I don't think this is what you are asking about, but those reads can be exported again with the tool NGS: Picard > SamToFastq (extract reads and qualities from a SAM/BAM dataset and convert to fastq). Some mapping tools also include an option to output separate fastq datasets for mapped and unmapped reads.
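For anyone working outside Galaxy, the read-export step can be sketched in plain Python. This is only an illustration of what such a conversion does, not the Picard implementation: the function name `sam_to_fastq` and the example records are made up, it reads uncompressed SAM text only, and it does not add /1 or /2 suffixes for paired reads. It relies on the SAM FLAG bits 0x4 (unmapped), 0x10 (reverse strand), 0x100 (secondary) and 0x800 (supplementary).

```python
# Sketch only: split SAM text records into mapped and unmapped FASTQ records.
# Assumes uncompressed SAM lines; paired-end naming is not handled.
COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def sam_to_fastq(sam_lines):
    """Return (mapped, unmapped) lists of FASTQ-formatted strings."""
    mapped, unmapped = [], []
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):
            continue  # skip blank lines and header lines
        fields = line.rstrip("\n").split("\t")
        name, flag = fields[0], int(fields[1])
        seq, qual = fields[9], fields[10]
        if flag & 0x900:
            continue  # skip secondary (0x100) and supplementary (0x800) records
        if flag & 0x10:
            # SAM stores reverse-strand reads reverse-complemented;
            # restore the orientation the sequencer reported.
            seq = seq.translate(COMPLEMENT)[::-1]
            qual = qual[::-1]
        record = f"@{name}\n{seq}\n+\n{qual}\n"
        (unmapped if flag & 0x4 else mapped).append(record)
    return mapped, unmapped


# Hypothetical example records (tab-separated SAM fields).
example = [
    "@SQ\tSN:ref\tLN:20",
    "r1\t0\tref\t1\t42\t4M\t*\t0\t0\tACGT\tIIII",
    "r2\t16\tref\t3\t42\t4M\t*\t0\t0\tAACC\tIHGF",
    "r3\t4\t*\t0\t0\t*\t*\t0\t0\tTTTT\tFFFF",
]
mapped_reads, unmapped_reads = sam_to_fastq(example)
print("".join(mapped_reads), end="")
```

A BAM file would first need to be converted to SAM text (or read with a BAM-aware library) before a sketch like this could be applied.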
To assemble those mapped reads into consensus sequences, please see the Galaxy tutorials here: https://galaxyproject.org/learn/
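To make the idea of "consensus contigs" concrete, here is a minimal Python sketch. Note that it is not an assembly and not the method used in the tutorials: it is a simple reference-guided majority vote per position, with a new contig started wherever coverage drops out. It assumes ungapped alignments given as hypothetical (0-based start, sequence) pairs, so real BAM/SAM records with insertions, deletions or clipping would need their CIGAR strings handled first. The names `consensus_contigs`, `to_fasta`, `min_depth` and `ref_name` are illustrative only.

```python
from collections import Counter


def consensus_contigs(alignments, min_depth=1):
    """Majority-vote consensus from ungapped (start, seq) alignments.

    Returns a list of (contig_start, consensus_sequence); a new contig
    begins wherever a position has fewer than min_depth reads.
    """
    columns = {}
    for start, seq in alignments:
        for offset, base in enumerate(seq):
            columns.setdefault(start + offset, Counter())[base] += 1

    contigs = []
    current_start, current, previous = None, [], None
    for pos in sorted(columns):
        counts = columns[pos]
        if sum(counts.values()) < min_depth:
            continue  # treat low-coverage positions as gaps
        if previous is not None and pos != previous + 1:
            contigs.append((current_start, "".join(current)))
            current = []
        if not current:
            current_start = pos
        # Most frequent base; ties resolved alphabetically for determinism.
        current.append(max(sorted(counts), key=counts.get))
        previous = pos
    if current:
        contigs.append((current_start, "".join(current)))
    return contigs


def to_fasta(contigs, ref_name="ref"):
    """Format contigs as FASTA, naming each by its 1-based reference span."""
    return "".join(
        f">{ref_name}:{start + 1}-{start + len(seq)}\n{seq}\n"
        for start, seq in contigs
    )


# Hypothetical reads: three overlapping near position 0, one isolated at 10.
reads = [(0, "ACGT"), (2, "GTTA"), (2, "GATA"), (10, "CCC")]
print(to_fasta(consensus_contigs(reads)), end="")
```

Raising `min_depth` makes the sketch stricter about which positions count as covered, which splits the output into more, shorter contigs.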
Prior Q&A might also help: review or search the posts in the right sidebar, or search all Galaxy resources here: https://galaxyproject.org/search/?q#gsc.tab=0
I am not sure if your data is RNA or DNA, but either way, one of the Galaxy tutorials probably fits what you want to do. If you cannot find a match, please explain more about your data content/analysis goals and we can help more.
Thanks, Jen, Galaxy team