Question: Running MACS2 without a control sample
10 months ago, jasonanthonywhite wrote:

While going through the Analysis of ChIP-seq data tutorial, I realized that one of my input controls has significantly less sequencing coverage than its experimental counterpart (fingerprint plot: https://imgur.com/a/y8yXo). The control sample has one quarter as many mapped reads as the treated sample (4 million vs 16 million). My concerns are as follows:

  1. In spite of the shallow coverage, is the control sample still usable? I would think it has some value but I am not sure if I am missing something.

  2. The MACS documentation mentions that including a control sample is optional, but I have not been able to determine definitively how omitting the control affects the results. It would seem that using a control helps to limit false positives. Is that accurate?

  3. With or without a control sample, wouldn't visualization of the sample BAM files and called peaks be an adequate way to validate the accuracy of MACS2?

Thanks for the help!!

Tags: control, macs2, chip-seq
10 months ago, Jennifer Hillman Jackson (United States) wrote:

Hello,

MACS2 Callpeak can normalize the difference in coverage/depth between treatment and control directly. The settings are on the tool form under the section Advanced Options. The MACS2 manual and support resources are here: https://github.com/taoliu/MACS/wiki. Prior Q&A and discussion about the value of controls in this type of analysis can be found at the MACS2 Google group and in many publications, and the topic comes up often at other general bioinformatics forums like https://biostars.org.
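To make the depth normalization concrete, here is a minimal sketch of the arithmetic for the read counts in the question (16 million treatment, 4 million control). It is an illustration only, not MACS2's internal code; the function name `depth_scale_factors` is hypothetical. The `scale_to` argument mirrors the idea behind the `--scale-to` option of `macs2 callpeak`, which, as far as the MACS2 documentation describes it, scales the larger dataset toward the smaller one by default.

```python
def depth_scale_factors(treat_reads, control_reads, scale_to="small"):
    """Return (treatment_factor, control_factor) for linear depth scaling.

    Hypothetical helper for illustration; not part of MACS2.
    scale_to="small": scale the deeper sample down to the shallower one.
    scale_to="large": scale the shallower sample up to the deeper one.
    """
    if treat_reads <= 0 or control_reads <= 0:
        raise ValueError("read counts must be positive")
    if scale_to not in ("small", "large"):
        raise ValueError("scale_to must be 'small' or 'large'")

    pick = min if scale_to == "small" else max
    target = pick(treat_reads, control_reads)
    return target / treat_reads, target / control_reads


# Read counts taken from the question: 16 M treatment vs 4 M control.
small = depth_scale_factors(16_000_000, 4_000_000, scale_to="small")
large = depth_scale_factors(16_000_000, 4_000_000, scale_to="large")
print(small)  # (0.25, 1.0)
print(large)  # (1.0, 4.0)
```

Scaling the shallow control up by a factor of 4 amplifies its noise along with its signal, which is one reason scaling toward the smaller sample is generally the more conservative choice.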

In short, the control provides a background the treatment is contrasted against to determine relative differences in signal.

  • A peak in a treated sample (no control) is a simple, isolated result. The signal might be important (unique to the treatment) or not; the result alone does not say which.
  • A peak, or the absence of one, in a treated sample that differs from the control is a signal with context. If the treated sample lacks a peak that is present in the control (normal) sample, that result cannot be identified unless a control is included in the analysis.
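Outside Galaxy, the two situations above correspond to running `macs2 callpeak` with or without the `-c` (control) argument. The sketch below only builds the two command lines; it does not run them. The file names, run names, output directory, and the `-g hs` (human) genome size are all assumptions for illustration, and `macs2_callpeak_args` is a hypothetical helper.

```python
import shlex


def macs2_callpeak_args(treatment, name, control=None,
                        genome_size="hs", fmt="BAM", outdir="macs2_out"):
    """Build an argument list for `macs2 callpeak` (hypothetical helper).

    The control file is optional: when it is None, no -c flag is added
    and MACS2 has no control sample to contrast the treatment against.
    """
    args = ["macs2", "callpeak",
            "-t", treatment,      # treatment (ChIP) alignments
            "-f", fmt,            # input file format
            "-g", genome_size,    # effective genome size ("hs" assumes human)
            "-n", name,           # prefix for output files
            "--outdir", outdir]
    if control is not None:
        args += ["-c", control]   # control (input) alignments
    return args


# Placeholder file names; substitute your own BAM files.
with_control = macs2_callpeak_args("treatment.bam", "with_control",
                                   control="control.bam")
no_control = macs2_callpeak_args("treatment.bam", "no_control")

print(shlex.join(with_control))
print(shlex.join(no_control))
```

Either list can be passed to `subprocess.run(args, check=True)` on a machine where MACS2 is installed.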

For your questions (how-tos for each are mostly covered in the tutorial you linked, or in the others linked below):

  1. Double-check the mapping runs (all of them, not just the control/controls). Review the FastQC report for sequence quality and do QA/QC as needed. Then review the settings and manual for the mapping tool used, and try small changes to see whether the mapping rates can be improved.
  2. Run the analysis with and without a control and compare the outputs to determine how different the results are given your exact inputs.
  3. It is certainly a good idea to visualize the results as a sanity check on quality. Other tracks can be brought in to add even more context: the locations of other genomic features, or other ChIP-seq results (example: ENCODE).
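The comparison suggested in item 2 can be quantified with a simple interval-overlap count. This is a minimal sketch that assumes the peak files are BED-like (MACS2 narrowPeak files start with chrom, start, end columns); the toy intervals below are invented for illustration and are not real results. The helper names are hypothetical.

```python
from collections import defaultdict


def parse_peaks(lines):
    """Parse BED-like lines into (chrom, start, end) tuples."""
    peaks = []
    for line in lines:
        if not line.strip() or line.startswith(("#", "track", "browser")):
            continue  # skip blank lines and header lines
        chrom, start, end = line.split()[:3]
        peaks.append((chrom, int(start), int(end)))
    return peaks


def count_overlapping(query, reference):
    """Count query peaks that overlap at least one reference peak."""
    by_chrom = defaultdict(list)
    for chrom, start, end in reference:
        by_chrom[chrom].append((start, end))
    hits = 0
    for chrom, start, end in query:
        # Half-open intervals overlap when each starts before the other ends.
        if any(start < r_end and r_start < end
               for r_start, r_end in by_chrom.get(chrom, ())):
            hits += 1
    return hits


# Toy data standing in for the two MACS2 runs (invented coordinates).
no_control_peaks = parse_peaks([
    "chr1\t100\t200",
    "chr1\t500\t650",
    "chr2\t1000\t1200",
])
with_control_peaks = parse_peaks([
    "chr1\t120\t210",
    "chr2\t5000\t5100",
])

shared = count_overlapping(no_control_peaks, with_control_peaks)
only_without_control = len(no_control_peaks) - shared
print(f"{shared} of {len(no_control_peaks)} peaks called without a control "
      f"are also called with one")
# prints: 1 of 3 peaks called without a control are also called with one
```

Peaks that appear only in the no-control run are the first candidates to inspect in a genome browser, since they may be background signal that the control would have accounted for. The linear scan is fine for a quick check; for genome-scale peak sets a dedicated interval tool would be more efficient.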

Galaxy tutorials: https://galaxyproject.org/learn/

Support FAQs: https://galaxyproject.org/support/

Thanks! Jen, Galaxy team
