Question: >90% aligned concordantly 0 times ChIP-seq Bowtie2
Gema Sanz Santos (European Union) wrote, 9 weeks ago:

Hi,

I know this question has been raised many times here and in other forums, but I've tried everything other people suggested in previous posts and still can't figure out where the problem is. This is the output summary of Bowtie2 (I checked several times that the target genome, hg19, is correct):


21404130 reads; of these:
  21404130 (100.00%) were paired; of these:
    21196512 (99.03%) aligned concordantly 0 times
    104527 (0.49%) aligned concordantly exactly 1 time
    103091 (0.48%) aligned concordantly >1 times
    ----
    21196512 pairs aligned 0 times concordantly or discordantly; of these:
      42393024 mates make up the pairs; of these:
        906397 (2.14%) aligned 0 times
        28296440 (66.75%) aligned exactly 1 time
        13190187 (31.11%) aligned >1 times
97.88% overall alignment rate
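The arithmetic behind this summary can be checked with a short Python sketch. It is only an illustration: the variable names are made up here and the counts are copied from the Bowtie2 output above. It suggests that almost every mate maps on its own while almost no pair maps concordantly, which points at the pairing rather than at read quality or the reference genome.

```python
# Counts copied from the Bowtie2 summary above (variable names are illustrative).
total_pairs = 21404130
concordant_pairs = 104527 + 103091     # "exactly 1 time" + ">1 times"
non_concordant_pairs = 21196512        # pairs aligned 0 times concordantly or discordantly
mates_aligned = 28296440 + 13190187    # mates of those pairs that aligned on their own
mates_unaligned = 906397

# Every pair contributes two mates to the overall rate.
total_mates = 2 * total_pairs
aligned_mates = 2 * concordant_pairs + mates_aligned

concordant_rate = concordant_pairs / total_pairs
mate_rate = mates_aligned / (2 * non_concordant_pairs)
overall_rate = aligned_mates / total_mates

print(f"pairs aligned concordantly:  {concordant_rate:.2%}")
print(f"mates aligned individually:  {mate_rate:.2%}")
print(f"overall alignment rate:      {overall_rate:.2%}")
```

The last figure reproduces the 97.88% overall alignment rate that Bowtie2 reports, so the individual reads are fine; only the pairing fails.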


I have paired-end ChIP-seq data, 50 bp reads. These are my steps (in Galaxy):

1) I groomed the fastq files to get fastqsanger (I checked that the setting is correct: Input FASTQ quality scores type --> Sanger & Illumina 1.8+).

2) FastQC is OK for all samples, with some adapter contamination.

3) I trimmed with either TrimGalore or Trimmomatic (both give the same result), supplying the forward and reverse files for each sample. I checked that I didn't mix up the forward and reverse files: I ran Bowtie2 with file 1 first and file 2 second, and also with file 2 first and file 1 second, and I get the same result.

4) I mapped the trimmed files to the hg19 genome using Bowtie2 with the --fr option and I get the result shown above. No matter what I try, I always get the same result with all the samples that I have. All of them have around 32% of reads with adapters, which I trimmed. I have no idea what to do with this.

Any suggestions or ideas about where the mistake is? Am I missing something?

Thank you very much in advance.

Gema

alignment bowtie chip-seq
Gema Sanz Santos (European Union) wrote, 9 weeks ago:

Thank you for your help!

Yes, now I'm quite sure the data is single-end. There was confusion over the file names: the files were 2 technical replicates of each sample, but they were named as if they were paired, and so I got confused.

Jennifer Hillman Jackson (United States) wrote, 9 weeks ago:

Hello,

The data formatting didn't come through intact (when pasting directly to this forum, use the "code sample" text format, or link in from a site that preserves format, such as a Gist). Still, I was able to modify the data and run the tool FastQC to ensure the data is in fastqsanger format (it is). A wrong format would not cause this exact problem (the overall mapping rate would be low instead), but it is good to eliminate that as a contributing issue. https://galaxyproject.org/support/fastqsanger/

The failure to recognize F/R pairs is most likely because both inputs are either single-end sequencing results or both the forward reads of paired-end sequencing. https://galaxyproject.org/tutorials/ngs/#paired-end-data

Paired reads would have identifiers like this:

@HWI-ST1018:141:H0HVBADXX:1:1101:1235:1969 1:N:0:CAGATC 
@HWI-ST1018:141:H0HVBADXX:1:1101:1235:1969 2:N:0:CAGATC

or sometimes

@HWI-ST1018:141:H0HVBADXX:1:1101:1235:1969 1:N:0:CAGATC 
@HWI-ST1018:141:H0HVBADXX:1:1101:1235:1969 3:N:0:CAGATC

In short, the data does not appear to be paired. If you believe that it should be, contacting the sequencing center or data provider that generated the data is advised. They can clarify the actual contents and the library preparation/sequencing methods.
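As a rough illustration of this check, the Python sketch below compares two FASTQ header lines. The function names are hypothetical, and the assumption that the read number is the first colon-separated field after the space is taken from the example identifiers above; the "observed" headers are the first records of File1 and File2 as posted elsewhere in this thread.

```python
def parse_header(line):
    """Split a FASTQ header into (read name, read number).

    Assumes the layout of the example identifiers above:
    '@<name> <read number>:<other fields>'.
    """
    name, _, comment = line.strip().lstrip("@").partition(" ")
    read_number = comment.split(":")[0] if comment else None
    return name, read_number


def looks_paired(header_1, header_2):
    """True if both headers share a name and are read 1 plus read 2 (or 3)."""
    name_1, read_1 = parse_header(header_1)
    name_2, read_2 = parse_header(header_2)
    return name_1 == name_2 and read_1 == "1" and read_2 in {"2", "3"}


# The expected pattern for a true pair (example identifiers from above).
expected_1 = "@HWI-ST1018:141:H0HVBADXX:1:1101:1235:1969 1:N:0:CAGATC"
expected_2 = "@HWI-ST1018:141:H0HVBADXX:1:1101:1235:1969 2:N:0:CAGATC"

# First headers of File1 and File2 as posted in this thread.
observed_1 = "@HWI-ST1018:141:H0HVBADXX:1:1101:1235:1969 1:N:0:CAGATC"
observed_2 = "@HWI-ST1018:141:H0HVBADXX:2:1101:1184:1971 1:N:0:CAGATC"

print(looks_paired(expected_1, expected_2))
print(looks_paired(observed_1, observed_2))
```

In the posted files the two headers differ in lane (1 versus 2) and both carry read number 1, which is consistent with two single-end files rather than one paired-end library. Comparing only the first records is a quick sanity check, not proof; a real check would walk both files in parallel.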

Sorting fastq data prior to mapping is usually not needed, but if you want to, this is how (although sorting the input fastq alone will not resolve your issue with paired-end data recognition): https://biostar.usegalaxy.org/p/8245/

Sorting downstream in an analysis is often important (it depends on the tool). This is how: https://galaxyproject.org/support/sort-your-inputs/

Thanks! Jen, Galaxy team

Jennifer Hillman Jackson (United States) wrote, 9 weeks ago:

Hello,

Are you certain that the library prep produced '--fr' read orientation? That could be the problem if F/R is not a correct match for the data.

Related help: https://galaxyproject.org/tutorials/ngs/

Thanks, Jen, Galaxy team

Gema Sanz Santos (European Union) wrote, 9 weeks ago:

Hi Jen, thank you for your answer.

This is an example of the first reads from TrimGalore output files 1 and 2 (as shown in Galaxy with the "eye" button):

File1:

@HWI-ST1018:141:H0HVBADXX:1:1101:1235:1969 1:N:0:CAGATC
CCCANGATCTGTTCCACAGGAGATAAGCAGATCTTACTCCAGAGACCACTG
+
BB<F#0<B<FBFFBFBBBFF<B<<FFFFIIBFFFBF<FBBB<FB<FFIFF7

@HWI-ST1018:141:H0HVBADXX:1:1101:1134:1970 1:N:0:CAGATC
TGGCNTATGAAGTTCAGTGTTCTTTGGCTTGTTAGTCAGAACTGTTGC
+
BBBF#0BFFFFFFIFFFIIIIIFIIIFFIIIIIIIIIIIIIIFIIIII

@HWI-ST1018:141:H0HVBADXX:1:1101:1153:1984 1:N:0:CAGATC
ACCTCTGTCTCCCAGGTTCAAGCGATTCTCCTGCCTCAGCCTCCCGAGT
+
BBBFFF<0B<B<FBFB<0<FFB0FBB70BBBB0BFF<<BB<BFBFBFBB

File2:

@HWI-ST1018:141:H0HVBADXX:2:1101:1184:1971 1:N:0:CAGATC
CTGATCAGAGGAGGAACATGACTAATCTATGGGCAGCCTACACTGAAGGC
+
BBBFFFFFFFFFFIIIIIIIIIIIIIFIIIIIIIIIIIIIIIIIIIIIII

@HWI-ST1018:141:H0HVBADXX:2:1101:1423:1966 1:N:0:CAGATC
AATGGGTAGGTAAATGGATGGCTGGGTGAATGGATGGGTGGGTGGATTGGC
+
BBBFFFFFFFFFFIIIIFFIIFIIIIFFFIIIIFFFIIBFFIBFFFIIII7

@HWI-ST1018:141:H0HVBADXX:2:1101:1596:1990 1:N:0:CAGATC
GATATCTTTTGTTTGTAGATATCTTTTCTAAGGCCCACATTCAGTGCAGAC
+
BBBFFBFFFFBBBBF<BFIIIIIIIIIIIIFIIFFFIIFIIIFF0B<F7FF

Now I realize... it seems that after TrimGalore the reads are not sorted? TrimGalore is supposed to keep the reads sorted, right?
