Question: remove overrepresented (contaminated) sequence
2.7 years ago by
jieun.e.park0 wrote:


Based on the FastQC results, I have two overrepresented sequences. One is my TruSeq adapter sequence and the other looks like a contaminant (below): AAGCAGTGGTATCAACGCAGAGTACTTTTTTTTTTTTTTTTTTTTTTTTTT

I would like to remove/trim this contaminant sequence from my fastq file before mapping, but I don't know how to do it. I heard that this can be done with fastq-grep, but I am not sure how to use a grep tool to remove the contaminant sequence.

Thank you for your help in advance.

rna-seq • 2.6k views
If you could give me a detailed explanation (I am new to usegalaxy!), I would appreciate it so much!

2.7 years ago by
Jennifer Hillman Jackson (25k) wrote:


If most of each affected read is contaminant, those reads will simply fail to map, so trimming or filtering beforehand is not strictly necessary. If you still want to remove it, see the trimming tools in the tool group NGS: QC and manipulation. Try Trim Galore! or Trimmomatic. Documentation is on each tool's form.

Thanks, Jen, Galaxy team
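
For readers who want to see what trimming does to the reads, here is a minimal Python sketch. It is not how Trim Galore! or Trimmomatic work internally. It assumes plain four-line FASTQ records, looks only for an exact match to the first 20 bases of the sequence from the question, cuts the read at that point, and drops reads that end up shorter than 20 bases. The function names and both thresholds are made up for illustration. The real tools also tolerate mismatches and partial matches at read ends, so prefer them for actual data.

```python
from typing import Iterable, Iterator, Tuple

# Overrepresented sequence reported by FastQC in the question (copied verbatim).
CONTAMINANT = "AAGCAGTGGTATCAACGCAGAGTACTTTTTTTTTTTTTTTTTTTTTTTTTT"

# header line, sequence, '+' separator line, quality string
Record = Tuple[str, str, str, str]


def read_fastq(lines: Iterable[str]) -> Iterator[Record]:
    """Yield FASTQ records, assuming exactly four lines per record."""
    stripped = (line.rstrip("\n") for line in lines)
    # zip over the same iterator four times groups the lines in fours.
    for header, seq, plus, qual in zip(stripped, stripped, stripped, stripped):
        yield header, seq, plus, qual


def trim_contaminant(seq: str, qual: str,
                     contaminant: str = CONTAMINANT,
                     min_match: int = 20) -> Tuple[str, str]:
    """Cut the read at the first exact match of the contaminant's first
    min_match bases (illustrative threshold; no mismatches allowed)."""
    pos = seq.find(contaminant[:min_match])
    if pos == -1:
        return seq, qual  # contaminant not found, read unchanged
    return seq[:pos], qual[:pos]  # keep bases and qualities in sync


def clean_records(records: Iterable[Record],
                  min_len: int = 20) -> Iterator[Record]:
    """Trim every record and drop reads shorter than min_len afterwards."""
    for header, seq, plus, qual in records:
        seq, qual = trim_contaminant(seq, qual)
        if len(seq) >= min_len:
            yield header, seq, plus, qual
```

To use it outside Galaxy, open the FASTQ file, pass the file handle to read_fastq, feed the result to clean_records, and write each returned record back out as four lines.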

