Question: Population Genetics analysis help
6 months ago, trblo17 wrote:

Hello, I'm working on a bioinformatics analysis for my senior undergrad project. I have 3 GB of genetic data that I have run through the following steps: FastQC, Trim Galore, mapping with BWA, Filter BWA, RmDup, and FastQC again. I'd now like to run some sort of population analysis, such as the ones available in the Genome Diversity tools section. They all require a file format that I'm not familiar with; they all seem to take gd_ped or gd_snp. Is it possible to convert from a BAM file to one of those formats? I'm just trying to check for variability among the population. Any help is greatly appreciated!

6 months ago, Jennifer Hillman Jackson (United States) wrote:


Data could be parsed out of the BAM dataset and then reformatted into the other file types, but that is not part of the core workflow for this tool group (see the publication linked below for recommended methods).

There are no tools to convert BAM to gd_snp or gd_ped automatically, but I can point you to the file specifications and a few tools that can help build some of them. The Genome Diversity tools also have formatting help directly on the tool forms. Tools in the Text Manipulation group can be used for many data parsing functions (convert to SAM format first), or you can format the data on the command line with your own methods and then upload the prepared content to Galaxy.
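If you go the command-line route, here is a minimal sketch of the "parse it yourself" idea. It assumes the BAM was first converted to SAM text (for example with `samtools view -h in.bam > in.sam`) and relies only on the standard SAM layout: header lines start with `@`, and each alignment has 11 mandatory tab-separated fields. The function names (`sam_to_table`, `write_tsv`) and the output columns are made up for illustration. This does not produce gd_snp or gd_ped; check the format specifications for the columns those actually need.

```python
import csv
import io

# The 11 mandatory SAM columns, in order.
SAM_FIELDS = ["QNAME", "FLAG", "RNAME", "POS", "MAPQ", "CIGAR",
              "RNEXT", "PNEXT", "TLEN", "SEQ", "QUAL"]


def sam_to_table(sam_lines, min_mapq=0):
    """Yield (read, chrom, pos, mapq) for mapped reads with MAPQ >= min_mapq.

    Hypothetical helper: the output columns are illustrative only.
    """
    for line in sam_lines:
        if line.startswith("@") or not line.strip():
            continue  # skip header and blank lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < len(SAM_FIELDS):
            continue  # skip malformed lines
        rec = dict(zip(SAM_FIELDS, fields))
        if int(rec["FLAG"]) & 0x4:
            continue  # flag bit 0x4 means the read is unmapped
        mapq = int(rec["MAPQ"])
        if mapq < min_mapq:
            continue
        yield rec["QNAME"], rec["RNAME"], int(rec["POS"]), mapq


def write_tsv(rows, handle):
    """Write the rows as a tab-delimited table with a header line."""
    writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
    writer.writerow(["read", "chrom", "pos", "mapq"])
    writer.writerows(rows)


# Tiny made-up SAM input so the sketch runs without any real data.
DEMO_SAM = (
    "@HD\tVN:1.6\tSO:coordinate\n"
    "@SQ\tSN:chr1\tLN:1000\n"
    "r1\t0\tchr1\t100\t60\t4M\t*\t0\t0\tACGT\tIIII\n"
    "r2\t4\t*\t0\t0\t*\t*\t0\t0\tACGT\tIIII\n"
    "r3\t16\tchr1\t200\t10\t4M\t*\t0\t0\tTTTT\tIIII\n"
)

if __name__ == "__main__":
    out = io.StringIO()
    write_tsv(sam_to_table(DEMO_SAM.splitlines(keepends=True), min_mapq=20), out)
    print(out.getvalue(), end="")
```

On real data you would pass an open file handle in place of the demo lines, then upload the resulting tabular file to Galaxy and continue with the Text Manipulation tools from there.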

Example tools to create gd_* datasets (found within the Genome Diversity tool group):

  • Make File
  • Prepare Input
  • Convert

File format specifications:

The publication that describes the workflow:

Please be aware that these are older tools. They have no ongoing development, some have known issues, and there is limited programming support for fixes. You might want to consider more current methods instead: see the Galaxy tutorials, in particular those from the Galaxy Training Network (GTN) under "Analysis of Sequences". Not all of these tools will be available on Galaxy public servers, but a pre-configured Docker image is available, or the tools can be installed from the ToolShed into your own local, cloud, or Docker Galaxy.

Thanks! Jen, Galaxy team
