Question: Get Flanks (Version 1.0.0)
6.6 years ago by
Fabricia Nascimento wrote:
Hi, I am very new to genomic data analysis and I need to get the regions upstream and downstream of some chromosome regions of the pig genome. I have about 70 BLAT hits for a query of ca. 100 aa. I need to get 7000 nucleotides both upstream and downstream of this 100 aa region. I have tried to use Get Flanks to get the "new" coordinates, but instead of generating one set of coordinates spanning about 14000 nucleotides, it generates one set of coordinates for the upstream region and then another one for the downstream region. Is there a way of doing what I need using Galaxy? I would appreciate any help! Thanks a lot! All the best, Fabricia.
galaxy • 1.5k views
modified 6.6 years ago by Jennifer Hillman Jackson • written 6.6 years ago by Fabricia Nascimento
6.6 years ago by
United States
Jennifer Hillman Jackson wrote:
Hello Fabricia,

You are probably running the tool like this, correct? This lumps the upstream flank and downstream flank ends to create one interval:

"Region:" Whole feature
"Location of the flanking region/s:" Both
"Offset" 0
"Length of the flanking region(s):" 7000

Instead, run the tool twice to extract upstream and downstream regions into distinct intervals:

Run 1
"Region:" Whole feature
"Location of the flanking region/s:" Upstream
"Offset" 0
"Length of the flanking region(s):" 7000

Run 2
"Region:" Whole feature
"Location of the flanking region/s:" Downstream
"Offset" 0
"Length of the flanking region(s):" 7000

If your question has been misunderstood, please let us know.

Best,
Jen
Galaxy team

--
Jennifer Jackson
http://galaxyproject.org
written 6.6 years ago by Jennifer Hillman Jackson
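For readers who want to see the arithmetic behind an upstream and a downstream flank, here is a minimal Python sketch. It is not the Get Flanks tool's own code. The function name `get_flanks` is hypothetical, and the sketch assumes plus-strand features, an offset of 0, and half-open start/end coordinates. The example interval is the chr14 hit quoted later in this thread.

```python
def get_flanks(chrom, start, end, length=7000):
    """Return (upstream, downstream) flanks of one interval.

    Assumptions (not taken from the Galaxy tool itself): the feature is on
    the plus strand, the offset is 0, and coordinates are half-open.
    """
    # Upstream flank ends where the feature starts; never go below 0.
    upstream = (chrom, max(0, start - length), start)
    # Downstream flank starts where the feature ends.
    downstream = (chrom, end, end + length)
    return upstream, downstream


up, down = get_flanks("chr14", 64969084, 64969603)
print(up)    # ('chr14', 64962084, 64969084)
print(down)  # ('chr14', 64969603, 64976603)
```

The sketch does not clamp the downstream end to the chromosome length, which a real pipeline would need to check.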
Hi Jen,

Thanks a lot for your reply, but I think you misunderstood my question. I will reformulate it with examples.

I initially have (because I am doing just a preliminary analysis) 70 BLAT hits corresponding to different coordinates in the pig genome. What I would like to have is the flanking region in both directions around each of these BLAT hits. I am not working with genes (or introns and exons).

For example, imagine that the symbol ########## corresponds to my BLAT hit and the symbol -------------------- corresponds to a flanking region.

I initially have:
##########
and I would like to obtain:
-------------------- ########## --------------------

In numbers, I have:
chr14 64969084 64969603
I would like to have:
chr14 64962084 64976603
(This corresponds to the first coordinate minus 7000 and the last coordinate plus 7000.)

What I got using "Get Flanks" with the parameters
"Region:" Whole feature
"Location of the flanking region/s:" Both
"Offset" 0
"Length of the flanking region(s):" 7000
is:
chr14 64962084 64969084
chr14 64969603 64976603

Is there a way of merging the above coordinates to come up with what I need?

Thanks a lot for your help,
All the best,
Fabricia.

________________________________
From: Jennifer Jackson <jen@bx.psu.edu>
To: Fabricia Nascimento <nascimentoff@yahoo.com.br>
Cc: "galaxy-user@lists.bx.psu.edu" <galaxy-user@lists.bx.psu.edu>
Sent: Thursday, 3 May 2012 4:25
Subject: Re: [galaxy-user] Get flanks (version 1.0.0)
written 6.6 years ago by Fabricia Nascimento
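The "first coordinate minus 7000, last coordinate plus 7000" arithmetic in the example above can be sketched outside Galaxy as well. This is only an illustration. The function name `extend_line` is hypothetical, and it assumes a tab-separated interval line with the chromosome, start, and end in the first three columns.

```python
def extend_line(line, pad=7000):
    """Widen one tab-separated interval line by `pad` bases on each side.

    Assumes columns 1-3 are chromosome, start, end; any further columns
    are passed through unchanged.
    """
    fields = line.rstrip("\n").split("\t")
    start, end = int(fields[1]), int(fields[2])
    fields[1] = str(max(0, start - pad))  # do not run off the chromosome start
    fields[2] = str(end + pad)            # chromosome length is not checked here
    return "\t".join(fields)


print(extend_line("chr14\t64969084\t64969603"))
# chr14	64962084	64976603
```

Because each hit is widened on its own, overlapping hits stay as separate rows instead of being fused.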
Hi Fabricia,

To create a merged interval that spans the 7k upstream flank interval, the original interval, and the 7k downstream flank interval, do the following.

Start with the two files you already have:
1 - original intervals (extracted from BLAT hits)
2 - flank results from the query:
"Get Flanks"
"Region:" Whole feature
"Location of the flanking region/s:" Both
"Offset" 0
"Length of the flanking region(s):" 7000

Put both datasets into a single dataset using the tool "Operate on Genomic Intervals -> Concatenate", with "Both datasets are same filetype?" checked. On that result file, merge the intervals together using the tool "Operate on Genomic Intervals -> Merge".

If your original BLAT hits have any overlap, or the flanks you are generating overlap any of your other intervals (original or other flanks), then this is probably not going to give you the results you want. In that case, it may be simpler to modify the coordinates using "Text manipulation" tools. Specifically, use "Compute an expression on every row", run twice, once with the expression "c2 - 7000" and once with "c3 + 7000" (this subtracts 7000 from the start and adds 7000 to the end). Then use "Cut" to recreate the interval file using the new values as start and end.

Hopefully one of these will work for you.

Jen
Galaxy team

--
Jennifer Jackson
http://galaxyproject.org
written 6.6 years ago by Jennifer Hillman Jackson
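The concatenate-then-merge route described above can be illustrated with a small Python sketch. It is not the Galaxy Merge tool's implementation. The function name `merge_intervals` is hypothetical, and the sketch assumes that intervals which overlap or merely touch (end equal to the next start) are joined, which is what the advice above relies on.

```python
def merge_intervals(intervals):
    """Merge overlapping or touching (chrom, start, end) intervals.

    Assumption: touching intervals (end == next start) are joined, so a
    flank and the hit it borders collapse into one interval.
    """
    merged = []
    for chrom, start, end in sorted(intervals):
        if merged and merged[-1][0] == chrom and start <= merged[-1][2]:
            # Extend the previous interval instead of starting a new one.
            prev_chrom, prev_start, prev_end = merged[-1]
            merged[-1] = (prev_chrom, prev_start, max(prev_end, end))
        else:
            merged.append((chrom, start, end))
    return merged


original = [("chr14", 64969084, 64969603)]
flanks = [("chr14", 64962084, 64969084), ("chr14", 64969603, 64976603)]

# "Concatenate" is list addition here; "Merge" is the function above.
print(merge_intervals(original + flanks))
# [('chr14', 64962084, 64976603)]
```

The same sketch shows the caveat: two hits less than 14000 bases apart have overlapping flanks, so they come out as one long interval instead of two.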