Question: Bed To Gff
7.2 years ago, Keith Giles wrote:
I am trying to use the Galaxy "BED to GFF" function. The operation worked, but instead of giving me back any feature information (e.g., exon, intron, repeat), I just received back the sequence of the interval contained within the BED file. Does anyone know what I'm doing wrong? Also, does anyone know the best way to map each read of an RNA-seq run to a given feature?
7.2 years ago, Jennifer Hillman Jackson (United States) wrote:
Hi Keith,

Are you using a full BED12 file, or just a BED3-6? A full BED12 should return the available features. GFF column 3 is defined as:

3. feature - The name of this type of feature. Some examples of standard feature types are "CDS", "start_codon", "stop_codon", and "exon".

If this is not enough information, sharing a history would help ("Options -> Share or Publish"). You can send the link to me directly.

Best,
Jen
Galaxy team

--
Jennifer Jackson
http://usegalaxy.org
http://galaxyproject.org
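To illustrate why a full BED12 can yield "exon" features while a BED3-6 cannot, here is a minimal Python sketch of how the block columns of one BED12 line could be turned into GFF exon lines. It is not Galaxy's actual implementation: the function name `bed12_to_gff_exons`, the `source` label, and the choice to put the BED name in GFF column 9 are assumptions made for illustration. The coordinate shift reflects that BED is 0-based with an exclusive end, while GFF is 1-based with an inclusive end.

```python
def bed12_to_gff_exons(line, source="bed2gff_sketch"):
    """Turn one BED12 line into GFF exon lines, one per BED block.

    Hypothetical helper for illustration; not the Galaxy tool's code.
    """
    f = line.rstrip("\n").split("\t")
    if len(f) < 12:
        raise ValueError("expected 12 tab-separated BED columns")
    chrom, chrom_start, name, score, strand = f[0], int(f[1]), f[3], f[4], f[5]
    block_count = int(f[9])
    # blockSizes and blockStarts are comma-separated and may end with a comma.
    sizes = [int(x) for x in f[10].split(",") if x][:block_count]
    offsets = [int(x) for x in f[11].split(",") if x][:block_count]
    rows = []
    for size, offset in zip(sizes, offsets):
        start = chrom_start + offset + 1   # BED start is 0-based, GFF is 1-based
        end = chrom_start + offset + size  # BED end is exclusive, GFF end is inclusive
        rows.append("\t".join(
            [chrom, source, "exon", str(start), str(end), score, strand, ".", name]
        ))
    return rows


if __name__ == "__main__":
    example = "chr1\t1000\t5000\tgeneA\t0\t+\t1200\t4800\t0\t2\t500,1000,\t0,3000,"
    for row in bed12_to_gff_exons(example):
        print(row)
```

A BED3-6 line has no block columns, so there is nothing to derive per-exon features from; only the overall interval is available.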
Hi Keith,

Good questions. Hopefully this info can help.

To get from BED3 to BED12, use the BED3 as a filter in the UCSC Table Browser against a gene track (UCSC Genes, RefSeq Genes, etc.) and send the output to Galaxy. Or better, use a BED6 so that you can include strand in column 6. Just enter placeholder values for name (".", column 4) and score ("0", column 5) to pad the file out so that the UCSC Table Browser can interpret it. Interval is a Galaxy file type; for the UCSC Table Browser, the BED format must be intact and to spec. BED format is defined in the BED-to-GFF tool help (scroll down). If the input is BED12, the features listed are interpreted from the format.

If you want repeat information and such, then a tool like "Operate on Genomic Intervals -> Profile Annotations" may be a good choice. From the results, you could determine which ancillary tracks to pull over into Galaxy from the UCSC Table Browser (in GTF or BED format). There are choices here (multiple repeat tracks, for example). Please note this tool is currently set up for human annotation.

When running a query in the Table Browser for certain data, the way the internal query is structured will return every entry in the track with any coverage, in full (i.e., not clipped to the original BED/coordinate filters). A BED3 would be necessary to pull in data contained only in introns, although a BED6 that includes strand might be a better choice for stranded tracks. Don't use a BED12 if you want information about the entire region (transcribed and other).

These "any coverage" results from UCSC can be trimmed down in Galaxy using tools in "Operate on Genomic Intervals" and "Join, Subtract and/or Group" (depending on the data). The process would be step-by-step the first time, but can easily be saved as a workflow to reuse without redoing each step.
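The BED6 padding step described above (placeholder name "." in column 4, score "0" in column 5, strand in column 6) can be sketched in Python as follows. This is only an illustration: the function name `pad_bed3_to_bed6` is hypothetical, and the strand has to come from your own knowledge of the data, so the default of "+" here is an assumption and not a recommendation.

```python
def pad_bed3_to_bed6(line, strand="+"):
    """Pad one BED3 line to BED6 with placeholder name and score.

    Hypothetical helper for illustration; strand must be supplied by the user.
    """
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 3:
        raise ValueError("expected at least 3 tab-separated BED columns")
    if strand not in ("+", "-", "."):
        raise ValueError("strand must be '+', '-' or '.'")
    chrom, start, end = fields[:3]
    # Column 4 = name placeholder, column 5 = score placeholder, column 6 = strand.
    return "\t".join([chrom, start, end, ".", "0", strand])


if __name__ == "__main__":
    print(pad_bed3_to_bed6("chr1\t100\t200", strand="-"))
```

Applied line by line to a BED3 file, this produces a tab-separated file with six columns per line.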
If you would like more help, just let us know,

Jen
Galaxy team

--
Jennifer Jackson
http://usegalaxy.org
http://galaxyproject.org