when is repeatmasker used in data analysis pipelines?

Question: when is repeatmasker used in data analysis pipelines?

eforchielli • 0 wrote:

Hi everyone,

I am totally new to sequencing data analysis, and I apologize if this question has been asked and answered ad nauseam... but I can't seem to figure this out.

I know that RepeatMasker is commonly used to remove reads that contain repetitive elements from sequencing datasets. My question is, at what stage in an analysis pipeline is it commonly used, if at all? Would you apply RepeatMasker in ChIP-seq and RNA-seq analysis, or just during de novo genome assembly? Am I completely missing the point and correct usage of RepeatMasker? I've read the RepeatMasker documentation and have a sense of what it does, but I'm not sure when it's actually used.

I'm asking because I'm specifically interested in these discarded reads, and I'm not sure how to tell if certain existing datasets in public repositories are likely to have had this information removed.

Thanks for your help! Elena

rna-seq repeatmasker chip-seq • 367 views

modified 13 months ago by Jennifer Hillman Jackson ♦ 25k • written 13 months ago by eforchielli • 0

Jennifer Hillman Jackson ♦ 25k wrote:

Hello,

Genome data sources often use RepeatMasker to soft mask (repeats converted to lowercase bases) or hard mask (repeats replaced with N) the nucleotide databases they release. The file name and/or the README associated with the data should tell you whether masking was used (including which database choices) and how (soft or hard masking). Sources often release several versions: unmasked, soft masked, and hard masked.
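If the file name and README are not conclusive, the sequence itself gives a hint. Below is a minimal Python sketch, not part of RepeatMasker or Galaxy, that counts lowercase and N bases per FASTA record. The function names (`read_fasta`, `count_masking`, `describe_masking`) are made up for this illustration, and the labels are only a heuristic: runs of N can also be assembly gaps, so N bases alone do not prove hard masking.

```python
def read_fasta(lines):
    """Yield (name, sequence) pairs from an iterable of FASTA lines."""
    name, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            header = line[1:].split()
            name, chunks = (header[0] if header else ""), []
        else:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def count_masking(sequence):
    """Count uppercase bases, lowercase (soft-masked) bases, and N bases."""
    counts = {"upper": 0, "soft": 0, "hard": 0}
    for base in sequence:
        if base in "Nn":
            counts["hard"] += 1   # hard-masked repeat OR assembly gap
        elif base.islower():
            counts["soft"] += 1   # soft-masked repeat
        else:
            counts["upper"] += 1  # ordinary, unmasked base
    return counts


def describe_masking(sequence):
    """Heuristic label for one sequence; assumption-laden, see text above."""
    counts = count_masking(sequence)
    if counts["soft"] > 0:
        return "soft-masked"
    if counts["hard"] > 0:
        return "possibly hard-masked"
    return "unmasked"


if __name__ == "__main__":
    # Toy records standing in for a real genome FASTA file.
    demo = [
        ">seq_soft", "ACGTacgtACGT",
        ">seq_hard", "ACGTNNNNACGT",
        ">seq_plain", "ACGTACGT",
    ]
    for name, seq in read_fasta(demo):
        print(name, describe_masking(seq), count_masking(seq))
```

For a real genome you would pass an open file handle to `read_fasta` instead of the toy list, and treat the result as a prompt to check the source's README rather than as a definitive answer.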

If your genome is hosted at UCSC, they provide a RepeatMasker track for any genome with available RepeatMasker (rm) databases.

For example Galaxy workflows, please see our tutorials here: https://galaxyproject.org/learn/

Thanks! Jen, Galaxy team

written 13 months ago by Jennifer Hillman Jackson ♦ 25k
