Question: Extract an individual and associated sequence data from a .vcf file
I would appreciate any advice I can get. I have a .vcf file with sequence data for over 200 individuals. I am interested in only one of the individuals and would like to extract their data from the file for use in snpEff. Can I use Galaxy to do this? I have tried using vcftools, but I can't get it to work. I should add that while I can do some basic command-line work, I am still fairly computer illiterate.

The VCFtools tools are the best place to start. I would double-check that the filter criteria you expect to match actually match the data in your file; the match really does have to be exact. If you are still stuck, please share an example line that you want to find, the attribute you are looking for, and the exact tool used along with its parameters. With that, we can help tune the search/filter.
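For reference, pulling one individual out of a multi-sample VCF amounts to keeping the nine fixed columns plus that individual's column. Below is a minimal Python sketch of that idea. It is not how vcftools or any Galaxy tool works internally; the function name `extract_sample` and the sample labels `ind1` to `ind3` are made up for illustration. It does not recalculate INFO fields such as allele counts, and it keeps sites where the chosen individual carries no variant.

```python
def extract_sample(lines, sample):
    """Yield VCF lines keeping the 9 fixed columns plus one sample column."""
    col = None
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith("##"):
            # Meta-information lines pass through unchanged.
            yield line
        elif line.startswith("#CHROM"):
            header = line.split("\t")
            # Sample labels start at column 10; the match is exact and case-sensitive.
            if sample not in header[9:]:
                raise ValueError(f"sample {sample!r} not found in header")
            col = header.index(sample, 9)
            yield "\t".join(header[:9] + [header[col]])
        elif line:
            if col is None:
                raise ValueError("data line seen before the #CHROM header")
            fields = line.split("\t")
            yield "\t".join(fields[:9] + [fields[col]])


# Hypothetical three-sample file, held in memory for the example.
demo = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tind1\tind2\tind3",
    "1\t100\t.\tA\tG\t50\tPASS\t.\tGT\t0/0\t0/1\t1/1",
]
for out_line in extract_sample(demo, "ind2"):
    print(out_line)
```

On the command line, vcftools offers `--indv <name>` together with `--recode` for the same purpose; the name given must match the header label exactly.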


The tool "Filter" can be used much like the "find" function in a text editor. It may take a few tries to discover the best keyword(s) to use for how your data is labeled.
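One way to see how the data is labeled is to list the sample names from the VCF header line and compare them with the keyword you plan to search for. A small Python sketch, assuming a tab-delimited VCF; the function name `sample_names` and the labels `Ind_001` and `ind_002` are invented for the example:

```python
def sample_names(lines):
    """Return the sample labels from the #CHROM header line of a VCF."""
    for line in lines:
        if line.startswith("#CHROM"):
            # Columns 1-9 are fixed; everything after FORMAT is a sample label.
            return line.rstrip("\n").split("\t")[9:]
    raise ValueError("no #CHROM header line found")


# Hypothetical header, held in memory for the example.
header = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tInd_001\tind_002",
]
names = sample_names(header)
print(names)               # ['Ind_001', 'ind_002']
print("ind_001" in names)  # False: the comparison is case-sensitive
```

A keyword that differs from the label only in case or spacing will not match, which is a common reason for an empty result.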

If you are feeling up to more, you could also use "Select". The logic for the regular expressions is described on the tool's page. There are endless examples online, and everyone has to give these a test (or twenty) for the more tedious searches. Still, it is an option if you want to try.
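To give a feel for the regular-expression approach, here is a short Python sketch that keeps header lines plus records on chromosome 1. It uses Python's `re` module, so details of the syntax may differ from what the Galaxy tool accepts; the pattern, the helper name `select_lines`, and the demo lines are all assumptions made for illustration.

```python
import re

# Keep lines starting with '#' (headers) or whose first column is exactly "1".
# The trailing tab stops "1" from also matching chromosome "11".
pattern = re.compile(r"^(#|1\t)")


def select_lines(lines, regex):
    """Return only the lines in which the regular expression matches."""
    return [line for line in lines if regex.search(line)]


# Hypothetical records, held in memory for the example.
demo = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "1\t100\t.\tA\tG\t50\tPASS\t.",
    "11\t200\t.\tC\tT\t50\tPASS\t.",
    "2\t300\t.\tG\tA\t50\tPASS\t.",
]
for kept in select_lines(demo, pattern):
    print(kept)
```

Anchoring the pattern with `^` and ending it on the column separator is what makes the match exact rather than a loose substring search.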

Hope one of these works out for you. Jen, Galaxy team

Thank you! I have finally got vcftools to work and I think the end is in sight!


