remove overrepresented (contaminated) sequence

Question: remove overrepresented (contaminated) sequence

jieun.e.park • 0 wrote:

Hello,

Based on the FastQC result, I have two overrepresented sequences. One is my TruSeq adapter sequence, and the other looks like a contaminating sequence (below): AAGCAGTGGTATCAACGCAGAGTACTTTTTTTTTTTTTTTTTTTTTTTTTT

I would like to remove/trim this contaminating sequence from my fastq file before mapping, but I don't know how to do it. I heard that this can be done with fastq-grep, but I am not sure how to use the grep tool to remove the contaminating sequence.

Thank you for your help in advance.

rna-seq • 2.6k views

modified 2.7 years ago by Jennifer Hillman Jackson ♦ 25k • written 2.7 years ago by jieun.e.park • 0

If you could give me a detailed explanation (I am new to usegalaxy!), I would appreciate it so much!

written 2.7 years ago by jieun.e.park • 0

Jennifer Hillman Jackson ♦ 25k wrote:

Hello,

If most of the sequence is contaminant, it will fall out during mapping, so trimming/filtering is not really needed as a preliminary step. But if you still want to do it, see the trimming tools in the tool group NGS: QC and manipulation. Try Trim Galore! or Trimmomatic. Documentation is on each tool's form.

Thanks, Jen, Galaxy team

written 2.7 years ago by Jennifer Hillman Jackson ♦ 25k
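
For readers working outside Galaxy, the filtering idea from the question can be sketched in a few lines of plain Python. This is only an illustration, not how Trim Galore! or Trimmomatic work internally: it discards every read that contains the contaminant instead of trimming it, it assumes a simple four-lines-per-record FASTQ, and it looks for exact matches only. The function names (`read_fastq`, `filter_contaminant`) are made up for this example, and leaving the poly-T tail off the search string is an assumption so that reads with a shorter tail are still caught.

```python
import io

# Contaminant reported by FastQC in the question. The poly-T tail is left off
# (an assumption) so reads carrying a shorter tail still match.
CONTAMINANT = "AAGCAGTGGTATCAACGCAGAGTAC"


def read_fastq(handle):
    """Yield (header, sequence, plus, quality) from a 4-line-per-record FASTQ."""
    while True:
        header = handle.readline().rstrip("\n")
        if not header:  # end of file (or a blank line) stops the reader
            break
        seq = handle.readline().rstrip("\n")
        plus = handle.readline().rstrip("\n")
        qual = handle.readline().rstrip("\n")
        yield header, seq, plus, qual


def filter_contaminant(in_handle, out_handle, contaminant=CONTAMINANT):
    """Copy reads that lack the contaminant; return (kept, dropped) counts."""
    kept = dropped = 0
    for header, seq, plus, qual in read_fastq(in_handle):
        if contaminant in seq:  # exact substring match only
            dropped += 1
            continue
        out_handle.write(f"{header}\n{seq}\n{plus}\n{qual}\n")
        kept += 1
    return kept, dropped


if __name__ == "__main__":
    # Tiny in-memory demo with two invented reads.
    demo = io.StringIO(
        "@read1\nAAGCAGTGGTATCAACGCAGAGTACTTTTTTTT\n+\n"
        "IIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIII\n"
        "@read2\nACGTACGTAC\n+\nIIIIIIIIII\n"
    )
    cleaned = io.StringIO()
    kept, dropped = filter_contaminant(demo, cleaned)
    print(f"kept={kept} dropped={dropped}")
```

For real files, open the input and output with `open(...)` and pass the handles in the same way. With paired-end data, dropping reads from one file only would break the pairing, so a dedicated trimming tool is likely the safer choice there.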
