Question: Given a list of genes, return transcription start sites as a BED file
3.8 years ago by
yewenduo60
United States
yewenduo60 wrote:

Dear all,

I have a question that may seem naive to most of you: is there a program that takes a list of genes as input and returns a list of their TSSs (in BED format)? I want to plot the fold change in gene expression from my array data alongside my ChIP-Seq data, to get a rough idea of where on the genome my TF of interest is acting.

Preferably, I would like to do this on an existing Galaxy instance.

Thank you!


bioinformatics galaxy • 2.1k views
3.8 years ago by
United States
Jennifer Hillman Jackson25k wrote:

Hello,

Many data providers host this annotation, or it can be derived from reference gene annotation in several formats (BED12, GTF, GFF). iGenomes includes it in the GTF datasets it provides. You may need to map gene identifiers (HUGO, etc.) to the transcript names native to the reference annotation used.
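For anyone who prefers to do the GTF step outside Galaxy, here is a minimal Python sketch of one way to derive TSS positions from a GTF for a list of genes. It is only an illustration under some assumptions: the function name `tss_from_gtf` is made up for this example, the GTF is assumed to carry `gene_name` and `transcript_id` attributes on its `exon` rows, and the gene list is assumed to use the same symbols as `gene_name`. The TSS is taken as the 5' end of each transcript (lowest exon start on the + strand, highest exon end on the - strand) and written as a 1 bp BED6 interval.

```python
import re

# Matches GTF attribute pairs such as: gene_name "ABC";
ATTR = re.compile(r'(\w+) "([^"]*)"')


def tss_from_gtf(gtf_lines, gene_names):
    """Return sorted BED6 tuples (1 bp each) marking the TSS of every
    transcript whose gene_name is in gene_names.

    Assumes exon rows carry gene_name and transcript_id attributes.
    """
    wanted = set(gene_names)
    spans = {}  # transcript_id -> [chrom, min_start, max_end, strand, gene]
    for line in gtf_lines:
        if line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9 or fields[2] != "exon":
            continue
        attrs = dict(ATTR.findall(fields[8]))
        gene = attrs.get("gene_name")
        tid = attrs.get("transcript_id")
        if gene not in wanted or tid is None:
            continue
        start, end = int(fields[3]), int(fields[4])  # GTF: 1-based, inclusive
        rec = spans.setdefault(tid, [fields[0], start, end, fields[6], gene])
        rec[1] = min(rec[1], start)
        rec[2] = max(rec[2], end)

    rows = set()  # a set collapses transcripts that share the same TSS
    for chrom, start, end, strand, gene in spans.values():
        pos = start if strand == "+" else end  # 1-based TSS position
        # BED is 0-based, half-open, so the 1 bp interval is [pos - 1, pos)
        rows.add((chrom, pos - 1, pos, gene, 0, strand))
    return sorted(rows)
```

Passing an open file handle as `gtf_lines` and the gene symbols as `gene_names` returns tuples that can be joined with tabs and written out as a BED file. Genes with several transcripts may yield several TSS rows.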

Some genomes at UCSC contain specific tracks for TSS annotation. These can be used as-is, or filtered/intersected with a reference annotation track containing transcript/gene coordinates based on the same reference genome. Alternatively, the gene annotation track can be used directly and the TSSs extracted from it. All data from UCSC can be output in BED6/BED12 format using the tool "Get Data -> UCSC Main". Intersect the data (if needed) using tools from the group "Operate on Genomic Intervals" or possibly "BED Tools". Via the Table Browser, the UCSC tool above will accept a list of gene identifiers as a filter for reference transcript/gene tracks such as RefSeq.
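As a rough illustration of extracting TSSs from a transcript/gene track, the following Python sketch converts BED rows (at least six columns, as in BED6/BED12 output) into strand-aware TSS intervals. The function name `bed_to_tss` and the `flank` parameter are hypothetical and only for this example. It assumes the input has already been filtered to the genes of interest, for instance with the identifier filter in the Table Browser.

```python
def bed_to_tss(bed_lines, flank=0):
    """Convert BED6/BED12 transcript rows into BED6 TSS rows.

    The TSS is the start coordinate on the + strand and the last base
    (end - 1) on the - strand. `flank` optionally widens the 1 bp
    interval by that many bases on each side.
    """
    out = []
    for line in bed_lines:
        # Skip blank lines and UCSC header lines
        if not line.strip() or line.startswith(("track", "browser", "#")):
            continue
        fields = line.rstrip("\n").split("\t")
        chrom, start, end = fields[0], int(fields[1]), int(fields[2])
        name, score, strand = fields[3], fields[4], fields[5]
        tss = start if strand == "+" else end - 1  # 0-based TSS position
        tss_start = max(0, tss - flank)  # do not run off the chromosome start
        tss_end = tss + 1 + flank        # BED end coordinate is exclusive
        out.append("\t".join([chrom, str(tss_start), str(tss_end), name, score, strand]))
    return out
```

The returned lines can be written to a file and loaded into Galaxy as a BED dataset. A nonzero `flank` gives a window around each TSS, which may be more convenient when intersecting with ChIP-Seq peaks.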

Hopefully this helps, but let us know if you need more details about the method you plan to use. Include details (reference genome used, source) and some sample data from the data files being used. If complex, we may ask you to privately share a history from the public Main Galaxy instance containing the loaded data/manipulations you have tested so far.

Thanks, Jen, Galaxy team


Hi, Jen:

Thank you very much for your instructions!

written 3.8 years ago by yewenduo60