Question: SNP Check for homozygous and heterozygous
enginyol wrote (23 months ago):

I have 256 recombinant inbred lines (RILs) in the F7 generation. I have VCF files for the RILs, the mother and the father. I would like to filter out SNPs that are heterozygous in the parents. How can I do this in Galaxy? Thank you.

Guy Reeves (Germany) wrote (23 months ago):

I often want to do the same thing, but I change the question slightly to: 'I want to remove all sites where both parents are probable heterozygotes.'

It could be that parent A is called a confident 0/1 but parent B is a low-quality 0/0 (meaning that it could well be 0/1).

I use SnpSift, which has very good documentation available. The program is installed on usegalaxy.org (though I am not sure if it is the latest version).

I use an expression like this in SnpSift to filter the VCF:

! ((GEN[5].GL[1] > -4 ) & ( GEN[6].GL[1] > -4 ))

I think you can figure out what it does from the documentation, but here are some notes:

! = not (sites that match the expression are removed from the output .vcf).
& = and; in this case a site is removed only if both statements are true, i.e. both parents are potential heterozygotes.
GEN[5] = one parent is the 6th sample in the VCF (the first sample is 0, the second is 1, ...). The other parent, GEN[6], is the 7th sample in the .vcf file.
GL[1] = in the genotype likelihood (GL) field, the 2nd number is the likelihood of being heterozygous (the entries are numbered 0, 1, 2, and entry 0 is the GL of 0/0).
'> -4' = if the GL value for being heterozygous is greater than this, I consider the site a questionable heterozygote. This is my rule of thumb; it will also depend on how your GL is scaled (mine comes from FreeBayes), so you need to establish this threshold yourself. Other programs may use GP instead of GL.
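For anyone without SnpSift to hand, the logic of the expression can be sketched in plain Python. This is only an illustration of the rule described in the notes, not a replacement for SnpSift: the function names (het_gl, filter_vcf) and the keep_matching switch are made up for this example, the sample positions 5 and 6 and the -4 threshold are simply the values from the expression, and treating a site with a missing GL in either parent as "not matching" is an assumption that may differ from what SnpSift does.

```python
def het_gl(sample_field, fmt_keys):
    """Return GL[1] (the 0/1 likelihood) for one sample column, or None if absent."""
    values = dict(zip(fmt_keys, sample_field.split(":")))
    gl = values.get("GL")
    if gl is None or gl == ".":
        return None
    parts = gl.split(",")
    if len(parts) < 2 or parts[1] == ".":
        return None
    return float(parts[1])


def filter_vcf(lines, parent_a=5, parent_b=6, threshold=-4.0, keep_matching=False):
    """Yield header lines plus the sites that pass the filter.

    A site "matches" when both parents have GL[1] > threshold, i.e. both are
    potential heterozygotes. By default matching sites are dropped (the '!'
    in the expression); keep_matching=True keeps only the matching sites.
    """
    for line in lines:
        if line.startswith("#"):
            yield line  # header lines are always passed through
            continue
        cols = line.rstrip("\n").split("\t")
        fmt_keys = cols[8].split(":")  # FORMAT column, e.g. GT:GL
        samples = cols[9:]             # sample columns, numbered from 0
        a = het_gl(samples[parent_a], fmt_keys)
        b = het_gl(samples[parent_b], fmt_keys)
        # Assumption: a missing GL in either parent counts as "not matching".
        matches = a is not None and b is not None and a > threshold and b > threshold
        if matches == keep_matching:
            yield line
```

With hypothetical file names, it could be used like this: open "input.vcf" for reading and "filtered.vcf" for writing, then call dst.writelines(filter_vcf(src)). Passing keep_matching=True corresponds to dropping the ! from the expression.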

To test, remove the ! and you should get a file containing the sites where both parents are potential heterozygotes. Hope this helps.

Guy


Since writing this I looked on usegalaxy.org and see that SnpSift is no longer there, which is a shame as it is a really useful program. I know that in the distant past (https://biostar.usegalaxy.org/p/14003/) there was an issue with it not behaving as the documentation indicated, but this was due to an old version being installed on usegalaxy.org. I have a current version on my Galaxy instance and it works great. I think it should be put back on usegalaxy.org! Cheers

Comment written 23 months ago by Guy Reeves

You may be able to use the 'VCFfilter' tool, which is on usegalaxy.org, but I am not convinced.

Comment written 23 months ago by Guy Reeves

Thank you for sharing. I have heard of SnpSift; however, it is not installed there now.

Comment written 23 months ago by enginyol
Jennifer Hillman Jackson (United States) wrote (23 months ago):

Hello,

Please see these resources:

Thanks! Jen, Galaxy team
