Question: Differential comparison of ChIP-seq/SICER data

Hello all. I am looking for some help with a computational genetics task in Galaxy, but I am new to bioinformatics and practical genetics, so please forgive me if my questions seem elementary or in need of clarification.

I am trying to figure out a way to look at quantitative differences between the presence of two DNA-binding proteins. In the lab I am interning for, we are looking at the difference in trimethylation of histone H3 under two different conditions, in order to understand changes in transcriptional regulation.

For simplicity, I have four ChIP-seq files: one for H3 and one for K4 (i.e., trimethylated H3) for each condition. I have mapped those to my reference genome (using BWA, along with an input ChIP-seq control), converted them to BED files, and then run those BED files against the input using SICER to find peaks showing where these histones are in the genome.

After this point, I am lost. In general, H3 and K4 are going to show up in the same spots in the genome, so the resulting BED files telling me where they are without the magnitudes aren't very helpful. What I want is a measure of the K4/H3 ratios for each location, so that I can compare the same loci from both conditions and see if there is a change in how much a certain H3 is methylated.

I think the wig files produced by SICER have some magnitude data (number of counts), but I cannot figure out whether there is a way to compare two wig files and perform operations on them (such as subtracting the values in one wig file from those in another). Interval operations require that they be converted to BED, which causes them to lose the count data anyway.
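Outside Galaxy, the wig subtraction described above could be sketched in a few lines of Python. This is only a minimal sketch under assumptions: it assumes the wig files are in the standard `variableStep` wiggle layout with one position and one count per data line, and that both files use the same window size so that positions line up. The function names `parse_variable_step_wig` and `subtract_counts` are hypothetical helpers, not part of SICER or Galaxy.

```python
def parse_variable_step_wig(text):
    """Return {(chrom, start): value} from variableStep wiggle text."""
    counts = {}
    chrom = None
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and track/browser/comment header lines.
        if not line or line.startswith(("track", "#", "browser")):
            continue
        if line.startswith("variableStep"):
            # Declaration line, e.g. "variableStep chrom=chr1 span=200".
            fields = dict(f.split("=", 1) for f in line.split()[1:])
            chrom = fields["chrom"]
            continue
        if line.startswith("fixedStep"):
            raise ValueError("fixedStep wiggle is not handled in this sketch")
        pos, value = line.split()
        counts[(chrom, int(pos))] = float(value)
    return counts


def subtract_counts(a, b):
    """Window-by-window a - b; a window missing from one file counts as 0."""
    return {key: a.get(key, 0.0) - b.get(key, 0.0) for key in sorted(set(a) | set(b))}
```

Raw counts from two libraries of different sequencing depth are not directly comparable, so some normalization would probably be needed before a subtraction like this is meaningful.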

So in summary, I need to:

A. Use SICER to find peaks for H3 and K4 sequences

B. Generate a file that gives the K4/H3 ratio at each peak for both conditions

C. Find the difference in those ratios between both conditions at each peak, ideally so that I can generate another BED file only at peaks where the difference meets a certain threshold.
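To make steps B and C concrete, here is a rough Python sketch of the arithmetic. It assumes the K4 and H3 counts have already been collected for a shared set of peak regions in each condition; how to get those counts is the open part of the question. The names `k4_h3_ratio` and `differential_peaks`, the pseudocount (added so an H3 count of zero does not cause a division by zero), and the input layout are all illustrative assumptions.

```python
def k4_h3_ratio(k4, h3, pseudocount=1.0):
    """K4/H3 ratio for one peak, with a pseudocount to avoid dividing by zero."""
    return (k4 + pseudocount) / (h3 + pseudocount)


def differential_peaks(cond1, cond2, threshold, pseudocount=1.0):
    """Return BED-style lines for peaks whose ratio changes by at least `threshold`.

    cond1 and cond2 map (chrom, start, end) -> (k4_count, h3_count).
    Only peaks present in both conditions are compared.
    """
    lines = []
    for peak in sorted(set(cond1) & set(cond2)):
        r1 = k4_h3_ratio(*cond1[peak], pseudocount=pseudocount)
        r2 = k4_h3_ratio(*cond2[peak], pseudocount=pseudocount)
        diff = r2 - r1
        if abs(diff) >= threshold:
            chrom, start, end = peak
            # Fourth column (the BED name field) carries the ratio difference.
            lines.append(f"{chrom}\t{start}\t{end}\t{diff:.3f}")
    return lines
```

For example, a peak with counts (K4=9, H3=4) in condition 1 and (K4=29, H3=4) in condition 2 gives ratios of 2.0 and 6.0 with the default pseudocount, so it would pass a threshold of 1.0.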

Does anyone know how I might go about this?

sicer galaxy chip-seq
written 3.4 years ago by michael.alexander.finger • modified 3.3 years ago by Jennifer Hillman Jackson
Jennifer Hillman Jackson wrote (3.3 years ago):

Hello,

Have you tried converting the wiggle files to interval with the tool "Wiggle-to-Interval"? This will preserve the scores associated with the defined peak regions. 

From there, the genomic interval tools can be used to merge the datasets and find overlaps while preserving the scores. The scores can then be compared and filtered with the tools in Text Manipulation and Filter and Sort.
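For anyone who wants to see what the overlap step amounts to, here is a minimal Python sketch of joining two scored interval lists while keeping both scores. It is not how the Galaxy interval tools are implemented; `join_overlaps` is a hypothetical helper, and the intervals are assumed to be BED-style (0-based start, end not included) tuples of (chrom, start, end, score).

```python
from collections import defaultdict


def join_overlaps(a, b):
    """Return (chrom, start, end, score_a, score_b) for each overlapping pair.

    start/end describe the overlapping portion only. This is a simple
    all-against-all comparison per chromosome, fine for small inputs.
    """
    by_chrom = defaultdict(list)
    for chrom, start, end, score in b:
        by_chrom[chrom].append((start, end, score))

    joined = []
    for chrom, a_start, a_end, a_score in a:
        for b_start, b_end, b_score in by_chrom[chrom]:
            start, end = max(a_start, b_start), min(a_end, b_end)
            if start < end:  # intervals that only touch do not overlap
                joined.append((chrom, start, end, a_score, b_score))
    return joined
```

With both scores on one row, the comparison and filtering become simple column operations, which is the role the Text Manipulation and Filter and Sort tools play inside Galaxy.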

Hopefully this helps, Jen, Galaxy team
