Question: (Closed) Problem with Percent duplication on MARK DUPLICATES
4 months ago by
julieta10 wrote:

Hello. I have a FASTQ file with 30,000 reads. When I analyze it with FastQC, the duplication level is very high. After I trim the reads to remove low-quality bases, the duplication level drops back to normal; I assume this is because changing the read lengths means the program no longer considers them duplicates. I then map the reads with Bowtie2 and pass the BAM file to MarkDuplicates. Almost all of the reads map, but PERCENT_DUPLICATION goes back up to 80%. How can this be? Thank you.

modified 4 months ago by Jennifer Hillman Jackson25k • written 4 months ago by julieta10

Hello joseludenia!

Questions similar to yours can already be found at:

We have closed your question to allow us to keep similar content in the same thread.

If you disagree with this, please tell us why in a reply below. We'll be happy to talk about it.


PS: Please see the original reply. Also, I think you have the answer in your own question: after trimming, FastQC is no longer finding read-level (sequence-identical) duplicates. Once the reads are mapped, duplicates are identified by alignment position, so PCR/optical duplicates are revealed.
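To make the distinction concrete, here is a minimal Python sketch (with hypothetical reads and alignment positions, not your data): a FastQC-style check counts exact duplicate sequences, while a MarkDuplicates-style check flags reads aligning to the same position, so trimmed copies with different sequences can still be position-level duplicates.

```python
from collections import Counter

# Hypothetical reads: two PCR copies of the same fragment, trimmed to
# different lengths at the 3' end, so their sequences no longer match.
reads = [
    ("read1", "ACGTACGTACGT"),   # full-length copy
    ("read2", "ACGTACGTAC"),     # same fragment, trimmed 2 bp shorter
]

# FastQC-style check: duplicates are exact sequence matches.
seq_counts = Counter(seq for _, seq in reads)
fastqc_dups = sum(c - 1 for c in seq_counts.values())

# MarkDuplicates-style check: duplicates share the same alignment
# position (here, both copies map to chr1:100 on the + strand).
alignments = {"read1": ("chr1", 100, "+"), "read2": ("chr1", 100, "+")}
pos_counts = Counter(alignments.values())
markdup_dups = sum(c - 1 for c in pos_counts.values())

print(fastqc_dups)   # 0 -> trimming hid the sequence-level duplicate
print(markdup_dups)  # 1 -> the position-level duplicate is still found
```

This is why trimming makes the FastQC duplication estimate drop while MarkDuplicates still reports a high PERCENT_DUPLICATION after mapping.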
written 4 months ago by Jennifer Hillman Jackson25k
The thread is closed. No new answers may be added.

