Picard Bam Statistic

6.3 years ago, Jennifer Hillman Jackson (United States) wrote:

Hello,

This means that your BAM file contains only aligned data. To confirm the counts, you could run another tool that reports similar statistics, such as "NGS: SAM Tools -> flagstat".

Best,
Jen
Galaxy team

-- Jennifer Jackson http://galaxyproject.org
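As a rough cross-check outside Galaxy, the mapped/unmapped tally can be sketched in a few lines of Python. This is only an illustration, not how flagstat itself is implemented: the function name `count_mapped` and the sample records are made up, and the only fact it relies on is that bit 0x4 of the SAM FLAG field marks an unmapped read. It expects SAM text (for example, a BAM converted to SAM first).

```python
from collections import Counter

UNMAPPED = 0x4  # SAM FLAG bit: read is unmapped


def count_mapped(sam_lines):
    """Tally mapped and unmapped records from an iterable of SAM text lines."""
    counts = Counter()
    for line in sam_lines:
        # Skip blank lines and header lines (which start with '@').
        if not line.strip() or line.startswith("@"):
            continue
        flag = int(line.split("\t")[1])  # FLAG is the second SAM column
        counts["total"] += 1
        counts["unmapped" if flag & UNMAPPED else "mapped"] += 1
    return counts


# Hypothetical records, for illustration only.
example = [
    "@HD\tVN:1.0\tSO:coordinate",
    "r1\t0\tchr1\t100\t60\t4M\t*\t0\t0\tACGT\tIIII",
    "r2\t16\tchr1\t200\t60\t4M\t*\t0\t0\tACGT\tIIII",
    "r3\t4\t*\t0\t0\t*\t*\t0\t0\tACGT\tIIII",
]
print(dict(count_mapped(example)))
```

If the "unmapped" count is zero, that matches the situation described above: the file contains only aligned data.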


Hi all,

We have received some BAM files that were aligned to hg18. What would be the easiest workflow to get a VCF file from GATK in build hg19? We are running a local Galaxy with only hg19 for the moment. Lifting the BAM file over would be my first choice; adding support for hg18 by generating all the indices would be my last :-)

Best regards,
Geert

Written 6.3 years ago by Geert Vandeweyer

Hello Geert,

For the best results, especially for SNPs, you will want to map directly to the target genome. The genome Galaxy is using is the same primary human genome that the GATK team uses: the 1000 Genomes build 37 -> "hg_g1k_v37". Click on the GATK links from one of the tools to see the details.

GATK provides liftOver files between the genomes, and you could install and use these with the liftOver tool, but not for BAM datasets. The accepted inputs are BED, Interval, and GFF, so the route would be BAM -> SAM -> interval.

GATK also provides (lifted) indexes for hg19, but Galaxy does not provide an hg19 genome that is sorted appropriately for GATK, or at least not yet. RNA-seq tools and most other tools up until now required one sort order, and GATK now requires another, but keeping the database dbkey the same is important for visualization and other functions. It can get complicated when moving between tools in a history. We will likely have some 'best practice' solutions soon, but for now, use the 1000 Genomes build to keep it all simple:

Human (Homo sapiens) (b37): hg_g1k_v37

The good news is that installing this genome has been greatly simplified. The genome and indexes are now available on an rsync server, so you can simply download the genome directory with all of its contents and add it to your instance. You will still need to create the .loc file entries, but the rest is done.

http://wiki.g2.bx.psu.edu/Admin/Data%20Integration

The "dbkey" is "hg_g1k_v37".

Hopefully one of the options works out for you!

Jen
Galaxy team

ps: Your post ended up threaded behind another post. I am not sure, but this may be because you started with a reply and changed the subject line. That is not enough to start a new thread. Instead, please create a brand new message in your email client, copy over the mailing list email address, and add a subject line. This will start a new thread that gets tracked and not missed. Thanks!

-- Jennifer Jackson http://galaxyproject.org
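To make the "BAM -> SAM -> interval" route mentioned above more concrete, here is a minimal Python sketch of turning one SAM record into a BED-style interval that a liftOver-type tool could accept. It is an illustration under stated assumptions, not a Galaxy or GATK tool: the function name `sam_to_bed` and the sample record are hypothetical, and the output keeps only the coordinates, read name, and strand, so the sequence and qualities are lost. That loss is one reason remapping to the target genome remains the better option for variant calling.

```python
import re

# One CIGAR operation: a length followed by an operation code.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")
# Operations that advance the position on the reference (per the SAM spec).
REF_CONSUMING = set("MDN=X")


def sam_to_bed(line):
    """Convert one SAM alignment line to a 6-column BED tuple, or None if unmapped."""
    fields = line.rstrip("\n").split("\t")
    name, flag, chrom = fields[0], int(fields[1]), fields[2]
    pos, cigar = int(fields[3]), fields[5]
    if flag & 0x4 or cigar == "*":
        return None  # unmapped read or no alignment information
    # Reference span = sum of the lengths of reference-consuming operations.
    ref_len = sum(int(n) for n, op in CIGAR_OP.findall(cigar) if op in REF_CONSUMING)
    start = pos - 1  # SAM POS is 1-based; BED start is 0-based
    strand = "-" if flag & 0x10 else "+"  # 0x10: read is reverse-complemented
    return (chrom, start, start + ref_len, name, 0, strand)


# Hypothetical record, for illustration only.
record = "r1\t16\tchr1\t100\t60\t2S5M2D3M1I4M\t*\t0\t0\t*\t*"
print(sam_to_bed(record))
```

Soft clips (S) and insertions (I) do not advance the reference position, which is why only M, D, N, = and X contribute to the interval length.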

Written 6.3 years ago by Jennifer Hillman Jackson
