Question: Dealing with rows in tabular data
d.angra (United Kingdom), 2.7 years ago, wrote:

Hello Experts

I am using Galaxy for genome-wide SNP discovery. I have discovered about 770,000 SNPs, and after performing operations in Galaxy I have been able to get sequences for each of them. I have a pool of several genotypes. At this point I have two questions:

1) How should I annotate my VCF file so that in the end I know where each SNP comes from?
2) Are there any tools for dealing with rows in Galaxy? The tools under Text Manipulation mostly work on columns, or I can't find the ones that work on rows. I need to delete alternate rows that I do not require.

I am certain somebody must have done this in the past. I hope to hear from you soon; I have been struggling with this for a few days.

Dips

Jennifer Hillman Jackson (United States), 2.7 years ago, wrote:

Hello,

For annotation, see the tools in the group NGS: Variant Analysis. From there, VCF Tools or others may be helpful; search by the term VCF to review the options.

General tools to slice data files by line are in the group Text Manipulation. For alternate rows one could add in line numbers (Add column with option: iterate) to create a tabular dataset with actual line numbers included. Then use the tool Filter or Select to pick certain rows of interest (odd, even, specific line numbers). At the end, don't forget to remove the counting column and reset the file format to VCF.
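Outside Galaxy, the same three steps (add a counter column, keep rows by counter, drop the counter) can be sketched in Python. This is only an illustration of the idea, not what the Galaxy tools run internally; the function names and the four sample rows are made up for the example.

```python
def number_rows(lines):
    """Append a 1-based counter as the last tab-separated column."""
    return [f"{line}\t{i}" for i, line in enumerate(lines, start=1)]


def keep_rows(numbered, parity):
    """Keep rows whose counter is even (parity=0) or odd (parity=1)."""
    return [row for row in numbered
            if int(row.rsplit("\t", 1)[1]) % 2 == parity]


def drop_counter(numbered):
    """Remove the counter column again so the original layout is restored."""
    return [row.rsplit("\t", 1)[0] for row in numbered]


# Hypothetical tabular rows, not taken from the poster's data.
rows = [
    "chr1\t100\tA\tG",
    "chr1\t250\tC\tT",
    "chr2\t75\tG\tA",
    "chr2\t900\tT\tC",
]

even_rows = drop_counter(keep_rows(number_rows(rows), parity=0))
print("\n".join(even_rows))
```

Passing `parity=1` instead keeps the odd-numbered rows.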

Hopefully this helps, Jen, Galaxy team

d.angra (United Kingdom), 2.7 years ago, wrote:

Hello Jen, thank you. This has been extremely helpful. Dips

d.angra (United Kingdom), 2.7 years ago, wrote:

Hi Jen, I thought I would achieve some success with your reply, but I am stuck in a few places. First, I need to know how to iterate the column if I only need to give two numbers, 1 and 2. Second, are there any short functions for selecting only even-numbered rows? I had a good look at the Filter and Sort functions, but with limited understanding. Could you please point me to where I could get more help on this?

Thanks in advance, Dips


For the line numbers, start with just "1" (no quotes).

Try a regular expression to filter on this column using the Select tool. Many resources for regular expressions are available on the internet. You'll have to test them out to suit your exact data. Rearrange columns if that helps (put the new column first, using the tool Cut).
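As one possible pattern, an even line number is any integer whose last digit is 0, 2, 4, 6 or 8. The Python sketch below assumes the counter has been moved to the first column and is followed by a tab; the rows are invented for the example, and the pattern syntax accepted by the Select tool may differ, so test it on your own data.

```python
import re

# Counter in the first column: optional leading digits, then an even last digit,
# then the tab that ends the column.
EVEN_FIRST_COLUMN = re.compile(r"^[0-9]*[02468]\t")

# Hypothetical numbered rows: "1<TAB>row1", "2<TAB>row2", ...
numbered = [f"{i}\trow{i}" for i in range(1, 13)]

selected = [line for line in numbered if EVEN_FIRST_COLUMN.search(line)]
print("\n".join(selected))
```

Swapping the character class for `[13579]` selects the odd-numbered rows instead.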

Reply written 2.7 years ago by Jennifer Hillman Jackson

Hi, to accomplish this I used AWK. Thanks
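The AWK command itself was not posted. For readers who prefer a script, a row-number test of the same kind can be sketched in Python as below. Passing lines that start with "#" through untouched is an assumption made here so that a VCF header survives; the function name and the sample records are hypothetical.

```python
import io


def keep_alternate_rows(handle, keep_even=True):
    """Yield every second data line; '#' header lines are passed through."""
    data_row = 0
    for line in handle:
        if line.startswith("#"):
            # Assumed behaviour: header lines are never counted or dropped.
            yield line
            continue
        data_row += 1
        if (data_row % 2 == 0) == keep_even:
            yield line


# Small made-up VCF-like input, for illustration only.
example = io.StringIO(
    "##fileformat=VCFv4.2\n"
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n"
    "chr1\t100\t.\tA\tG\t.\t.\t.\n"
    "chr1\t250\t.\tC\tT\t.\t.\t.\n"
    "chr2\t75\t.\tG\tA\t.\t.\t.\n"
    "chr2\t900\t.\tT\tC\t.\t.\t.\n"
)

result = list(keep_alternate_rows(example))
print("".join(result), end="")
```

Because the function reads one line at a time, it does not need to hold a file with hundreds of thousands of rows in memory.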

Reply written 2.7 years ago by d.angra