Isolating core sequences ("summits") of BED or FASTA files

Question: Isolating core sequences ("summits") of BED or FASTA files

murtaugh (United States) wrote, 3.3 years ago:

This seems like it should be self-explanatory, but I can't figure it out; apologies in advance. I'm playing with ChIP-seq files, using MACS to identify peaks, and of course the peaks in the resulting BED file are of variable length. I'd like to see if I can identify motifs within the peak "summits" corresponding to the IP'd transcription factor (in this case, CTCF), but all the motif-finding software I can locate requires FASTA-formatted sequences of identical size, or at least of relatively limited size. I have been able to pull the genomic sequences corresponding to my BED file into a FASTA file, but again each sequence is a different size, and some are quite large.

I can't seem to find any tool in Galaxy that will allow me to capture, for example, only the central 100 nucleotides of each line of a FASTA file, or similarly reduce the coordinates within a BED file down to that length around the center. I was able to do this "by hand" in Excel, using the XLS output of MACS and the "summit" position called for each peak, but it seems like there must be a tool to perform the same operation within Galaxy.
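For reference, the operation described above could be sketched outside Galaxy roughly as follows. This is only an illustration, not an existing Galaxy tool: the function names (`center_window`, `resize_bed_line`, `trim_sequence`) are hypothetical, the BED input is assumed to be tab-separated with 0-based starts, and any summit passed in is assumed to be an absolute 0-based coordinate (how to derive that from the MACS XLS columns depends on the MACS version and is not handled here).

```python
def center_window(start, end, width=100, summit=None):
    """Return (new_start, new_end) spanning `width` bases around a center.

    The center is the summit if one is given (assumed absolute, 0-based),
    otherwise the midpoint of the interval. The start is clamped at 0.
    """
    center = summit if summit is not None else (start + end) // 2
    new_start = max(0, center - width // 2)
    return new_start, new_start + width


def resize_bed_line(line, width=100):
    """Rewrite columns 2 and 3 of one tab-separated BED line.

    Extra columns (name, score, ...) are passed through unchanged.
    """
    fields = line.rstrip("\n").split("\t")
    start, end = int(fields[1]), int(fields[2])
    new_start, new_end = center_window(start, end, width)
    fields[1], fields[2] = str(new_start), str(new_end)
    return "\t".join(fields)


def trim_sequence(seq, width=100):
    """Return the central `width` characters of a sequence string.

    Sequences no longer than `width` are returned unchanged.
    """
    if len(seq) <= width:
        return seq
    offset = (len(seq) - width) // 2
    return seq[offset:offset + width]


if __name__ == "__main__":
    # A 500 bp peak reduced to 100 bp around its midpoint.
    print(resize_bed_line("chr1\t1000\t1500\tpeak_1"))
```

Note that this sketch does not skip `track` or `browser` header lines, and it does not check the resized window against the chromosome length, so a window near the end of a chromosome could run past it.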

Assuming this is a solvable problem within Galaxy itself, I might go further and ask whether there are good motif discovery tools within Galaxy.

Thanks a lot!

bed macs • 1.0k views
