FastQC Kmer in centre of sequence, quality trim?


Question: FastQC Kmer in centre of sequence, quality trim?

2.2 years ago, reubenmcgregor88 • 50 wrote:

I have run FastQC on some FASTQ sequences and the report tells me I have contaminating Kmers. Normally this would be fine, as I could quality trim to get rid of them before downstream analysis, but these seem to be present at positions 10-18 in my reads. See [1] below.

Does anyone know a) why this would happen and b) how they can be removed, or even whether they should be removed, before mapping etc.?

Thanks so much

Result of FastQC

fastq fastqc kmer quality galaxy • 1.3k views

modified 2.2 years ago by Jennifer Hillman Jackson ♦ 25k • written 2.2 years ago by reubenmcgregor88 • 50

2.2 years ago, Jennifer Hillman Jackson ♦ 25k (United States) wrote:

Hello,

FastQC performs the analysis on a sample of the data (first 200k reads). The reads are pretty short to begin with, and trimming out these regions would leave very short reads that could be difficult to map. So it is probably OK not to worry about, or remove, these embedded Kmer regions.
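To put a number on the "very short reads" concern, here is a minimal sketch in Python. The 36-base read, the trim point of 18 bases, and the 20-base minimum length are all illustrative assumptions rather than values from the data in this thread, and the helper names `hard_trim_5prime` and `keep_read` are hypothetical.

```python
def hard_trim_5prime(seq, qual, n_bases):
    """Drop the first n_bases from a read and from its quality string."""
    if len(seq) != len(qual):
        raise ValueError("sequence and quality strings differ in length")
    return seq[n_bases:], qual[n_bases:]


def keep_read(seq, min_length):
    """Return True if the trimmed read is still long enough to keep."""
    return len(seq) >= min_length


# Hypothetical 36-base read (assumption: the real read length is not stated here).
read_seq = "ACGT" * 9
read_qual = "I" * 36

# Remove everything up to and including position 18 (1-based).
trimmed_seq, trimmed_qual = hard_trim_5prime(read_seq, read_qual, 18)

print(len(trimmed_seq))            # bases left after trimming
print(keep_read(trimmed_seq, 20))  # would it pass a 20-base minimum?
```

Under these assumptions only 18 bases remain, which would fail a 20-base minimum-length filter. With real data, substitute the actual read length and whatever minimum length the mapper is expected to handle.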

I would instead focus on any adaptor sequence detected in the FastQC report. It is possible that adaptors are present in the first 10 bases and the Kmers follow those, representing some other artifact from library prep. However, since this would consume so much of the read (about half), these reads would fall out during the alignment step anyway, even if trimmed, especially if the data is RNA and a spliced alignment tool like TopHat/HISAT is used. 15 bases is pretty short to map unless it is DNA and the mapper (BWA, etc.) has its parameters tuned well.
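If you want to check for yourself where a given Kmer sits in the reads, a rough sketch along these lines may help. It is not how FastQC computes its Kmer module; it simply counts Kmers per start position over the first reads of a file. It assumes plain 4-line FASTQ records, the function names are hypothetical, and the values of `k` and `max_reads` are arbitrary choices. Positions are 0-based here, whereas FastQC reports 1-based positions.

```python
from collections import Counter, defaultdict
from itertools import islice


def read_fastq_sequences(lines):
    """Yield the sequence line of each record, assuming 4-line FASTQ records."""
    for i, line in enumerate(lines):
        if i % 4 == 1:
            yield line.strip()


def kmer_counts_by_position(sequences, k=7, max_reads=200_000):
    """Count every k-mer at every 0-based start position in the first max_reads reads."""
    counts = defaultdict(Counter)
    for seq in islice(sequences, max_reads):
        for pos in range(len(seq) - k + 1):
            counts[pos][seq[pos:pos + k]] += 1
    return counts


def positions_of(counts, kmer):
    """Map each start position where kmer was seen to the number of reads showing it there."""
    return {pos: c[kmer] for pos, c in counts.items() if c[kmer] > 0}


# Tiny made-up example: two reads sharing "CCCGGG" at the same offset.
demo = [
    "@r1", "AAAACCCGGGTTTT", "+", "IIIIIIIIIIIIII",
    "@r2", "TTTTCCCGGGAAAA", "+", "IIIIIIIIIIIIII",
]
counts = kmer_counts_by_position(read_fastq_sequences(demo), k=6)
print(positions_of(counts, "CCCGGG"))  # {4: 2}
```

For a real file, pass an open file handle instead of the `demo` list, for example `with open("reads.fastq") as fh:` (the filename is a placeholder). A Kmer that piles up at one offset in most reads points to a library-prep artifact rather than random genomic sequence.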

Others are still welcome to offer their opinions on this.

Best, Jen, Galaxy team

Hello Jen,

Thanks for the quick reply. Sorry, I should have mentioned that it is a ChIP-Seq experiment. So you would suggest just aligning the reads anyway after adaptor trimming? I realise it is hard to say without knowing more, but I am fairly new to this, so any advice is very welcome :)

Thanks

Reuben

written 2.2 years ago by reubenmcgregor88 • 50

OK - so DNA. I would go forward and give mapping the sequences a try. Thanks! Jen

written 2.2 years ago by Jennifer Hillman Jackson ♦ 25k
