Question: How to know if mapping to the reference genome was successful?
eurioste wrote (10 months ago):

Given a set of BAM files of single-end reads just mapped with BWA, how can I get an overview of whether the mapping will yield useful information and whether its quality is good? These data are to be used in a ChIP-seq experiment. Because this is a pilot experiment, there are no technical replicates, only one BAM for the input and one for each ChIPed protein/histone mark. Because of this, a "Correlation among samples" plot will not be useful.

I'm new to NGS and ChIP-seq data. I tried to find useful tutorials/references for Galaxy but could not find one that deals with a dataset in a realistic manner.

Which tools and plots should I use? Any general workflow for assessing mapping quality would be appreciated.

EDIT: I would like to have something like the distribution of MAPQ values or the distribution of the reads along the chromosomes.

Tags: mapping, bam, chip-seq
Jennifer Hillman Jackson (United States) wrote (10 months ago):

Hello,

QA/QC is covered in the Galaxy tutorials here: https://galaxyproject.org/learn/

Try these to start with (for assessing sequencing and mapping quality):

Thanks! Jen, Galaxy team

Thanks for the answer, but unfortunately these tutorials don't teach what I need. I've got FastQC and pre-processing well covered, and for post-processing the tutorials only deal with deduplication and cross-sample correlations. I would like to have something like the distribution of MAPQ values or the distribution of the reads along the chromosomes.

Comment written 10 months ago by eurioste

Hello,

This section related to bigWig coverage should help: https://galaxyproject.org/tutorials/chip/#generating-bigwig-datasets-for-display

And this tutorial has more help for assessing mapping success: http://galaxyproject.github.io/training-material/topics/chip-seq/tutorials/tal1-binding-site-identification/tutorial.html#step-7-inspection-of-peaks-and-aligned-data

Downstream tools can also be used to generate stats/metrics. Search for "bigwig" in the tool panel to find them; many are in the deepTools tool group. Example: computeMatrix followed by plotHeatmap and/or plotProfile. More usage details are on the tool forms.

MAPQ stats can be pulled out into a tabular dataset and graphed. Convert BAM to SAM as needed (without the header). Use Cut to extract the MAPQ value (5th column). The result is a tabular dataset that can be used with Charts, found under the Visualize icon (the small graph icon) on each dataset. Charts works with any tabular input, so any numerical value you are curious about can be isolated in a similar way and graphed.
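
If you ever want to check the same thing outside Galaxy, the idea can be sketched in a few lines of Python. This is only an illustration, not a Galaxy tool: it assumes the BAM has already been converted to SAM text, and the function name summarize_sam and the toy records below are made up for the example. It tallies the MAPQ value (5th column) and the reference name (3rd column) of each read, and counts reads flagged as unmapped (FLAG bit 0x4) under "*".

```python
from collections import Counter
from typing import Iterable, Tuple


def summarize_sam(lines: Iterable[str]) -> Tuple[Counter, Counter]:
    """Tally MAPQ values and reads per reference from SAM text lines.

    Hypothetical helper for illustration. Header lines (starting with
    "@") and malformed lines are skipped. Unmapped reads are counted
    under "*" and left out of the MAPQ tally.
    """
    mapq_counts: Counter = Counter()
    ref_counts: Counter = Counter()
    for line in lines:
        if not line.strip() or line.startswith("@"):
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:  # a SAM record has 11 mandatory columns
            continue
        flag = int(fields[1])
        if flag & 0x4:  # FLAG bit 0x4 marks an unmapped read
            ref_counts["*"] += 1
            continue
        ref_counts[fields[2]] += 1        # column 3: reference name
        mapq_counts[int(fields[4])] += 1  # column 5: MAPQ
    return mapq_counts, ref_counts


# Toy records, invented for this example (not real data).
TOY_SAM = [
    "@SQ\tSN:chr1\tLN:1000",
    "r1\t0\tchr1\t100\t60\t4M\t*\t0\t0\tACGT\tIIII",
    "r2\t16\tchr1\t200\t0\t4M\t*\t0\t0\tACGT\tIIII",
    "r3\t0\tchr2\t50\t60\t4M\t*\t0\t0\tACGT\tIIII",
    "r4\t4\t*\t0\t0\t*\t*\t0\t0\tACGT\tIIII",
]

if __name__ == "__main__":
    mapq, refs = summarize_sam(TOY_SAM)
    print("MAPQ\treads")
    for value, n in sorted(mapq.items()):
        print(f"{value}\t{n}")
    print("reference\treads")
    for name, n in sorted(refs.items()):
        print(f"{name}\t{n}")
```

For a real dataset you would pass an open file handle of the SAM text instead of the toy list. The two tables it prints are the same kind of tabular data that Charts expects, so this is just a cross-check on the Cut-based approach above.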

Hope that helps with more options!

Reply written 10 months ago by Jennifer Hillman Jackson (modified 10 months ago)