Can anyone point me to a good workflow to convert my ChIP-seq data
intervals to gene names? My peaks fall both within genes and within
Christopher Futtner, Ph.D.
Dept. of Surgery
Durham, NC 27710
There is no general workflow, primarily because there is no single
source of gene annotation and some formatting may be necessary. But
once that part is resolved, a single tool in "Operate on Genomic
Intervals" can most often assign gene names to query genome regions.
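To make the idea concrete outside of Galaxy, here is a minimal Python
sketch of what an overlap-based join does. It is not the Galaxy tool's
implementation. The function name assign_genes, the gene names, and the
coordinates are all made up for illustration, and intervals are assumed
to be 0-based and half-open (BED style).

```python
from collections import defaultdict


def assign_genes(peaks, genes):
    """Return one (chrom, start, end, gene_names) row per peak.

    peaks: iterable of (chrom, start, end)
    genes: iterable of (chrom, start, end, name)
    Coordinates are assumed 0-based, half-open (BED style).
    """
    # Group gene intervals by chromosome so each peak is only
    # compared against genes on its own chromosome.
    genes_by_chrom = defaultdict(list)
    for chrom, start, end, name in genes:
        genes_by_chrom[chrom].append((start, end, name))

    rows = []
    for chrom, start, end in peaks:
        # Two half-open intervals overlap when each one starts
        # before the other one ends.
        hits = sorted({
            name
            for g_start, g_end, name in genes_by_chrom.get(chrom, [])
            if g_start < end and start < g_end
        })
        rows.append((chrom, start, end, hits))
    return rows


# Hypothetical example data, not taken from any real annotation.
peaks = [("chr1", 100, 200), ("chr1", 5000, 5100), ("chr2", 50, 80)]
genes = [
    ("chr1", 150, 900, "geneA"),
    ("chr1", 180, 400, "geneB"),
    ("chr2", 80, 300, "geneC"),
]
result = assign_genes(peaks, genes)
```

A peak that overlaps no gene interval comes back with an empty list of
names, so it is kept in the output instead of being dropped.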
The first step is to obtain a dataset that has gene names assigned to
genome regions. This data should be based on the same reference genome
as your peak intervals.
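One quick sanity check before comparing the two datasets is to look for
chromosome names that appear in the peaks but not in the gene
annotation. The Python sketch below shows one way to do that. The
function name unmatched_chroms and the sample rows are made up for
illustration. Note that matching names do not prove that both files use
the same genome build. The check only flags obvious naming mismatches,
such as "2" versus "chr2".

```python
def unmatched_chroms(peaks, genes):
    """Return sorted chromosome names found in peaks but not in genes.

    Both inputs are iterables of rows whose first field is the
    chromosome name.
    """
    peak_chroms = {row[0] for row in peaks}
    gene_chroms = {row[0] for row in genes}
    return sorted(peak_chroms - gene_chroms)


# Hypothetical rows: the second peak uses "2" where the genes use "chr2".
example_peaks = [("chr1", 100, 200), ("2", 50, 80)]
example_genes = [("chr1", 150, 900, "geneA"), ("chr2", 80, 300, "geneC")]
missing = unmatched_chroms(example_peaks, example_genes)
```

An empty result means every peak chromosome has at least one gene row
with the same name. Anything listed is worth checking by hand before
running the comparison.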
Sources can vary by species and reference genome build. For most model
organisms, a source can be found in the set of linked projects under
"Get Data" and the dataset directly imported into Galaxy. For certain
genomes at UCSC, the tool "Operate on Genomic Intervals -> Profile
Annotations" is a quick way to see which UCSC Gene tracks are
associated. The RefSeq Genes track would be one example.
Once in Galaxy, use the tools in "Operate on Genomic Intervals" to
compare your peak intervals with gene intervals and link in gene
names. "Join" would be the simplest option. Help is located on each tool's
form, including links to the wiki and screencasts, but also directly
Hopefully this helps,