Hello, I am very new to sequencing data analysis and have some questions about how to structure my analysis. I am trying to analyze the difference in the presence of an epigenetic mark between a control group and a treatment group. The animal model we use is the mouse. I have 3 biological replicates from each of the two groups, and I am trying to find the differences in the presence of this mark between them. I'm fine up to the point of alignment: I've aligned all 6 samples to the mouse genome using Bowtie on Galaxy. But I'm stuck on how to call peaks. I know people usually use a control/input sample for ChIP-seq, where they skip the IP and just sequence the DNA to account for background noise, but we didn't do a non-IP control. Here are my thoughts on how to approach this and the options I came across. Please let me know which one is the most reasonable approach; it would be wonderful to have references to papers or protocols.
Option 1: All 6 mice are age-matched and of the same sex. Randomly pair each of the 3 control mice with one of the 3 treatment mice and call peaks for each pair, using the control mouse as the input. I would end up with 3 files of differential peaks, and then I would find the peaks present in all 3 files. I'm not sure which tool to use for that last step, maybe "Intersect" under "Operate on Genomic Intervals".
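To make the last step of option 1 concrete, here is a rough Python sketch of what I mean by "peaks present in all 3 files". It assumes each peak file has already been reduced to BED-style (chrom, start, end) intervals with half-open coordinates; the function names and the toy coordinates are made up for illustration, and this is not how any Galaxy tool is implemented.

```python
from typing import Dict, List, Tuple

# Assumed representation: BED-style half-open interval (chrom, start, end).
Interval = Tuple[str, int, int]


def intersect_two(a: List[Interval], b: List[Interval]) -> List[Interval]:
    """Return the overlapping segments between two peak lists."""
    # Group the second list by chromosome so we only compare like with like.
    by_chrom: Dict[str, List[Tuple[int, int]]] = {}
    for chrom, start, end in b:
        by_chrom.setdefault(chrom, []).append((start, end))

    overlaps: List[Interval] = []
    for chrom, s1, e1 in a:
        for s2, e2 in by_chrom.get(chrom, []):
            start, end = max(s1, s2), min(e1, e2)
            if start < end:  # keep only non-empty overlaps
                overlaps.append((chrom, start, end))
    return sorted(overlaps)


def common_to_all(peak_sets: List[List[Interval]]) -> List[Interval]:
    """Keep only the regions covered by a peak in every peak list."""
    if not peak_sets:
        return []
    common = peak_sets[0]
    for peaks in peak_sets[1:]:
        common = intersect_two(common, peaks)
    return common


# Toy peak calls from the three control/treatment pairings (hypothetical data).
pair1 = [("chr1", 100, 200), ("chr1", 500, 600), ("chr2", 50, 150)]
pair2 = [("chr1", 150, 250), ("chr2", 100, 200)]
pair3 = [("chr1", 180, 300), ("chr2", 120, 130), ("chr3", 10, 20)]

shared = common_to_all([pair1, pair2, pair3])
print(shared)  # [('chr1', 180, 200), ('chr2', 120, 130)]
```

Note that this keeps only the overlapping portion of each peak, so the shared regions can be much narrower than the original peaks; whether that or the full peak should be reported is a separate choice.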
Option 2: Combine the mapped reads into 1 file for each group, so that I will have two files, one control and one treatment. I'm not sure how to combine them in Galaxy; there are a few options to concatenate, join, or merge, so any suggestions on which to use would be helpful. Then I would call peaks on these two files, using control as input and treatment as target.
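What I have in mind by "combine" in option 2 is plain pooling: stack the reads from the three replicates of a group end to end and re-sort them by position. Here is a small Python sketch of that idea, assuming the mapped reads have been converted to simple (chrom, start, end, strand) records; the record layout, function name, and toy reads are my own illustration, not a Galaxy tool.

```python
from typing import List, Tuple

# Assumed representation of one mapped read: (chrom, start, end, strand).
Read = Tuple[str, int, int, str]


def pool_replicates(replicates: List[List[Read]]) -> List[Read]:
    """Pool reads from several replicates into one coordinate-sorted list."""
    pooled: List[Read] = []
    for reads in replicates:
        pooled.extend(reads)  # stack the records; nothing is dropped
    # Sort by chromosome name, then start, then end.
    pooled.sort(key=lambda r: (r[0], r[1], r[2]))
    return pooled


# Toy reads for the three control replicates (hypothetical data).
control_1 = [("chr1", 100, 136, "+"), ("chr2", 40, 76, "-")]
control_2 = [("chr1", 90, 126, "-")]
control_3 = [("chr1", 100, 136, "+")]

control_pooled = pool_replicates([control_1, control_2, control_3])
print(len(control_pooled))  # 4
```

Reads that appear at the same position in different replicates are all kept here, so the pooled file is simply the sum of the three replicates.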
Any suggestions would be helpful, and if there are Galaxy protocols, that would be great too. Thank you.