I need to import annotations of coding genes from the UCSC Genome Browser into Galaxy, and then use the "Operate on Genomic Intervals" set of tools to remove transcripts that overlap coding genes from my GTF file. How should I go about doing this? Thanks, Himanshu
Hi Himanshu,

The first protocol of this paper has a section where protein-coding exons are extracted from UCSC. This can be adjusted to cover the entire genome (region = genome) and can use any "Gene and Gene Prediction Track" you find appropriate (read the track methods at UCSC to understand the contents). There is a video walk-through for this protocol.

The fourth protocol explores all of the GOPS tools in detail, with examples. You will probably want to convert GTF to interval format before using these tools, then try a tool like "Coverage of a set of intervals on second set of intervals", and then do the data reduction. There is no video for protocol 4; see the tool forms themselves for help and links to expanded help.

When comparing two interval files that represent exons, you will probably need to be creative with dataset manipulation tools to exclude all lines belonging to transcripts that have any overlap. Tools in 'Join, Subtract, and Group' and 'Filter and Sort' will be helpful.

Vimeo Channel "CPB Using Galaxy":

Hopefully this helps get you started!

Jen
Galaxy team

--
Jennifer Hillman-Jackson
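
For readers who want to see the logic of the exclusion step spelled out, here is a minimal Python sketch. It is not a Galaxy tool and not the protocol's method. It assumes the coding exons were exported from UCSC as BED (0-based, half-open coordinates), that the GTF carries a transcript_id attribute on each line, and that strand can be ignored. The function names are hypothetical, and the overlap check is a plain scan, so it only suits small inputs.

```python
import re
from collections import defaultdict

# Matches the transcript_id attribute in GTF column 9.
TRANSCRIPT_ID = re.compile(r'transcript_id "([^"]+)"')


def parse_bed(lines):
    """Group BED intervals (0-based, half-open) by chromosome."""
    by_chrom = defaultdict(list)
    for line in lines:
        if not line.strip() or line.startswith(("#", "track", "browser")):
            continue
        chrom, start, end = line.rstrip("\n").split("\t")[:3]
        by_chrom[chrom].append((int(start), int(end)))
    return by_chrom


def overlapping_transcripts(gtf_lines, coding):
    """Return IDs of transcripts with at least one exon overlapping a coding interval."""
    hits = set()
    for line in gtf_lines:
        fields = line.rstrip("\n").split("\t")
        if line.startswith("#") or len(fields) < 9 or fields[2] != "exon":
            continue
        match = TRANSCRIPT_ID.search(fields[8])
        if match is None:
            continue
        # GTF is 1-based inclusive; convert to 0-based half-open to match BED.
        start, end = int(fields[3]) - 1, int(fields[4])
        for c_start, c_end in coding.get(fields[0], ()):
            if start < c_end and c_start < end:
                hits.add(match.group(1))
                break
    return hits


def remove_transcripts(gtf_lines, drop):
    """Keep every GTF line whose transcript_id is not in `drop`."""
    kept = []
    for line in gtf_lines:
        fields = line.split("\t")
        match = TRANSCRIPT_ID.search(fields[8]) if len(fields) >= 9 else None
        if match is None or match.group(1) not in drop:
            kept.append(line)
    return kept


if __name__ == "__main__":
    # Made-up demo data: one coding exon and two transcripts.
    bed = ["chr1\t100\t200\n"]
    gtf = [
        'chr1\tsrc\texon\t50\t90\t.\t+\t.\tgene_id "g1"; transcript_id "t1";\n',
        'chr1\tsrc\texon\t150\t250\t.\t+\t.\tgene_id "g1"; transcript_id "t1";\n',
        'chr1\tsrc\texon\t300\t400\t.\t+\t.\tgene_id "g2"; transcript_id "t2";\n',
    ]
    drop = overlapping_transcripts(gtf, parse_bed(bed))
    for line in remove_transcripts(gtf, drop):
        print(line, end="")
```

The two-pass design mirrors the advice above: first collect the IDs of transcripts with any overlapping exon, then drop every line belonging to those transcripts, including their non-overlapping exons.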
