Question: Get Only Repeatmasked Exons
Dear Galaxy expert(s),

I have a .BED file of regions from mouse. I guess many of them can span whole genes, i.e. many exons, and might even extend into the gene flanks. I need to get the REPEATMASKED sequences of only the annotated exons in these regions.

I see that if I use the tool "Fetch Sequences -> Extract Genomic DNA" on these regions, it returns sequences with mixed small and capital letters.

Question I: What are the small letters and what are the capitals here? Are these sequences already masked, do the cases mark exons/introns, or what? (I downloaded some of these sequences and repeatmasked them myself. My masked sequences overlap with some of "yours" written in small letters.)

Question II: Is the strand "honored" by this tool? I seem to remember from past experience that there was an issue, although I cannot recall what exactly.

Thank you in advance,
David
galaxy • 796 views
written 7.1 years ago by Managadze, David (NIH/NLM/NCBI) [F] • modified 7.1 years ago by Anton Nekrutenko
Anton Nekrutenko (Penn State) wrote, 7.1 years ago:
David: In the case of mouse, the sequences are extracted from soft-masked genomic builds retrieved from UCSC. So, small letters = repeats, capital letters = no repeats. As for the strand: yes, it is honored if the strand is explicitly specified. If it is not specified, it is assumed to be +. Thanks for using Galaxy. anton
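To make the soft-masking convention above concrete, here is a minimal Python sketch, not part of Galaxy or of the original answer. It assumes you already have an extracted sequence as a plain string; the function names (`repeat_intervals`, `hard_mask`, `orient`) are hypothetical and chosen only for illustration. It locates the small-letter (repeat) runs, optionally replaces them with N, and reverse-complements a sequence for the minus strand while preserving case, defaulting to + when no strand is given.

```python
import re

# Case-preserving complement table, so soft-masking survives a strand flip.
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def repeat_intervals(seq):
    """Return 0-based, half-open (start, end) spans of lowercase bases.

    In a soft-masked sequence these spans are the repeat-masked regions.
    """
    return [(m.start(), m.end()) for m in re.finditer(r"[a-z]+", seq)]


def hard_mask(seq, mask_char="N"):
    """Replace every lowercase (repeat) base with mask_char."""
    return "".join(mask_char if base.islower() else base for base in seq)


def orient(seq, strand="+"):
    """Return seq unchanged for '+', reverse-complemented for '-'.

    The default of '+' mirrors the behaviour described above for
    regions without an explicit strand.
    """
    if strand == "+":
        return seq
    if strand == "-":
        return seq.translate(_COMPLEMENT)[::-1]
    raise ValueError("strand must be '+' or '-'")


if __name__ == "__main__":
    example = "ACGTacgtTTGA"  # made-up sequence, for illustration only
    print(repeat_intervals(example))
    print(hard_mask(example))
    print(orient(example, "-"))
```

This only interprets the letter case of a sequence that has already been extracted. Restricting the output to annotated exons would still require intersecting the regions with an exon annotation before extraction.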