Given a list of genes, return transcription start sites as a BED file

3.8 years ago by

United States

yewenduo • 60 wrote:

Dear all,

I have a question that may seem naive to most of you: does anyone know of a program that takes a list of genes as input and returns a list of TSSs (in BED format)? I want to plot the fold change in gene expression that I obtained from array data together with my ChIP-Seq data, so that I can get a rough idea of where on the genome the hot spots for my TF of interest are.

Ideally, I would like to use an existing Galaxy instance for this.

Thank you!

bioinformatics galaxy • 2.1k views

modified 3.8 years ago by Jennifer Hillman Jackson ♦ 25k • written 3.8 years ago by yewenduo • 60

3.8 years ago by

Jennifer Hillman Jackson ♦ 25k

United States

Jennifer Hillman Jackson ♦ 25k wrote:

Hello,

Many data providers host this annotation, or it can be derived from reference gene annotation in several formats (BED12, GTF, GFF). iGenomes includes it in the GTF datasets it provides. You may need to map gene identifiers (HUGO, etc.) to the transcript names native to the reference annotation used.
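As a rough illustration of deriving TSS positions from a GTF outside of Galaxy, here is a minimal Python sketch. It assumes the GTF has `exon` rows carrying `transcript_id` and `gene_name` attributes (attribute names vary between sources, so `name_key` may need changing), and the function name `tss_from_gtf` is made up for this example. It is not a Galaxy tool, just one way the conversion could look.

```python
import re

# Matches GTF attribute pairs such as: gene_name "TP53";
ATTR = re.compile(r'(\S+) "([^"]*)"')

def tss_from_gtf(gtf_lines, wanted_genes, name_key="gene_name"):
    """Return sorted, de-duplicated BED6 tuples, one 1-bp TSS per transcript."""
    wanted = set(wanted_genes)
    spans = {}  # transcript_id -> [chrom, min_start, max_end, strand, gene]
    for line in gtf_lines:
        if line.startswith("#") or not line.strip():
            continue
        f = line.rstrip("\n").split("\t")
        if len(f) < 9 or f[2] != "exon":
            continue
        attrs = dict(ATTR.findall(f[8]))
        gene, tx = attrs.get(name_key), attrs.get("transcript_id")
        if gene not in wanted or tx is None:
            continue
        start, end = int(f[3]), int(f[4])
        if tx in spans:
            spans[tx][1] = min(spans[tx][1], start)
            spans[tx][2] = max(spans[tx][2], end)
        else:
            spans[tx] = [f[0], start, end, f[6], gene]

    rows = set()
    for chrom, start, end, strand, gene in spans.values():
        # GTF is 1-based inclusive; BED is 0-based half-open.
        if strand == "+":
            pos = start - 1   # first base of the transcript
        elif strand == "-":
            pos = end - 1     # last base, since the transcript runs right to left
        else:
            continue          # strand unknown, TSS cannot be assigned
        rows.add((chrom, pos, pos + 1, gene, 0, strand))
    return sorted(rows)
```

Calling it with an open GTF file handle and the gene list gives tuples that can be written out tab-separated as a BED6 file. Genes with several transcripts yield one row per distinct TSS.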

Some genomes at UCSC contain specific tracks for TSS annotation. These can be used as-is, or filtered/intersected with a reference annotation track containing transcript/gene coordinates based on the same reference genome. Alternatively, the gene annotation track can be used directly and the TSSs extracted from it. All data from UCSC can be output in BED6/BED12 format, using the tool "Get Data -> UCSC Main". Intersect the data (if needed) using tools from the group "Operate on Genomic Intervals" or possibly "BED Tools". The UCSC tool above will accept a list of gene identifiers as a filter for reference transcript/gene tracks such as RefSeq via the Table Browser.
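If you export a transcript track as BED and want to pull out the TSS yourself, the logic is short. The Python sketch below assumes tab-separated BED6 or BED12 lines with the strand in column 6; `bed_to_tss` is a made-up helper name, not a Galaxy or UCSC tool.

```python
def bed_to_tss(bed_lines):
    """Convert BED6/BED12 transcript lines into 1-bp BED6 TSS lines."""
    out = []
    for line in bed_lines:
        # Skip blank lines and UCSC header lines.
        if not line.strip() or line.startswith(("#", "track", "browser")):
            continue
        f = line.rstrip("\n").split("\t")
        if len(f) < 6:
            continue  # no strand column, so the TSS side is unknown
        chrom, start, end = f[0], int(f[1]), int(f[2])
        name, score, strand = f[3], f[4], f[5]
        # BED is 0-based, half-open: [start, end)
        if strand == "+":
            tss = start
        elif strand == "-":
            tss = end - 1
        else:
            continue
        out.append("\t".join([chrom, str(tss), str(tss + 1), name, score, strand]))
    return out
```

The output lines can be saved as a BED6 file and uploaded to Galaxy for intersection with the ChIP-Seq peaks.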

Hopefully this helps, but let us know if you need more details about the method you plan to use. Include details (reference genome used, source) and some sample data from the data files being used. If complex, we may ask you to privately share a history from the public Main Galaxy instance containing the loaded data/manipulations you have tested so far.

Thanks, Jen, Galaxy team

modified 3.8 years ago • written 3.8 years ago by Jennifer Hillman Jackson ♦ 25k

Hi, Jen:

Thank you very much for your instructions!

written 3.8 years ago by yewenduo • 60
