Question: FASTQC Overrepresented Sequences
4 months ago, marvin.holz wrote:

Hey all,

After running Trimmomatic to clip Illumina adapters, I always run FastQC to check the quality of my data.

This time, FastQC flagged overrepresented sequences in 80 % of my samples. I BLASTed them: they are not adapters but pre-40s rRNA and mitochondrial sequences. Their abundance is around 0.17 %.

My question: do I have to remove these sequences from my RNA-Seq data before calculating differentially expressed genes? If so, do I have to remove them from all samples, even those where they are not flagged as overrepresented?

Thanks for any help!

4 months ago, Jennifer Hillman Jackson (United States) wrote:


Much higher levels of rRNA sequence can indicate that something went wrong during library preparation. However, your rate is quite low at only around 0.17 % abundance; 17 % abundance would be much more significant.
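If you want to double-check the abundance of one of the reported sequences yourself, a rough Python sketch like the one below may help. It is not how FastQC counts internally; it simply assumes a standard 4-line FASTQ layout (no wrapped sequence lines) and uses exact substring matching. The function name `percent_reads_matching` and the example reads are made up for illustration.

```python
from typing import Iterable


def percent_reads_matching(fastq_lines: Iterable[str], query: str) -> float:
    """Return the percentage of reads whose sequence contains `query`.

    Assumes 4-line FASTQ records, so the sequence is the 2nd line of each record.
    """
    total = 0
    hits = 0
    for i, line in enumerate(fastq_lines):
        if i % 4 != 1:  # skip header, '+' separator, and quality lines
            continue
        total += 1
        if query.upper() in line.strip().upper():
            hits += 1
    # Avoid division by zero on empty input.
    return 100.0 * hits / total if total else 0.0


# Tiny made-up example: 1 of 4 reads contains the query sequence.
example = [
    "@r1", "ACGTACGT", "+", "IIIIIIII",
    "@r2", "TTTTGGGG", "+", "IIIIIIII",
    "@r3", "CCCCAAAA", "+", "IIIIIIII",
    "@r4", "GGGGTTTT", "+", "IIIIIIII",
]
pct = percent_reads_matching(example, "ACGTAC")
print(f"{pct:.2f} % of reads contain the query")
```

For a real file, you could pass an open file handle instead of the list, for example `gzip.open(path, "rt")` for a gzipped FASTQ. A result near 0.17 % would match what you describe, while something closer to 17 % would point to a library preparation problem.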

Duplication flagged in this module can be associated with contamination, but not always, and low rates shouldn't impact an analysis (most public annotation sources do not include rRNA, so these reads will drop out in later steps). For more details, please review:

Thanks! Jen, Galaxy team


