Question: Chromosome names and position names in VCF file
12 months ago, umermehar10 wrote:

Dear all, I have a VCF file generated with HaplotypeCaller in GATK. The chromosome and position names in it look unusual, for example gi|996703411|ref|NW_015379183.1|. Could somebody please help me fix this, or replace them with normal positions and chromosome names? It is rice genome data aligned to the IRGSP 1.0 reference genome.

vcf • 852 views
12 months ago, Jennifer Hillman Jackson (United States) wrote:

Hello,

All GATK tool wrappers are considered deprecated, whether they were installed from the Tool Shed into a local Galaxy or are hosted on a public Galaxy server.

That said, the issue comes from a mismatch between the inputs. The exact same reference genome must be used throughout the analysis, and the format of the Custom genome is very important. The sequence identifiers must be identical across all inputs (genome, annotation, mapping results, etc.) and, especially for GATK, must appear in a specific order. It is much easier to format the Custom genome FASTA correctly from the start, so that it can be used for mapping and all later steps without needing to "fix" anything afterwards.
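To see where identifiers disagree, a quick check along these lines may help. It is only a minimal sketch: the function names are made up for illustration, it assumes uncompressed text input, and it compares just the first word of each FASTA header against the CHROM column of the VCF data lines. It does not check the sequence order that GATK expects.

```python
def fasta_identifiers(lines):
    """Sequence identifiers (first word after '>') in file order."""
    return [
        line[1:].split()[0]
        for line in lines
        if line.startswith(">") and line[1:].strip()
    ]


def vcf_contigs(lines):
    """Distinct CHROM values from VCF data lines, in order of first appearance."""
    ordered = []
    known = set()
    for line in lines:
        # Skip meta-information, the column header, and blank lines.
        if line.startswith("#") or not line.strip():
            continue
        chrom = line.split("\t", 1)[0]
        if chrom not in known:
            known.add(chrom)
            ordered.append(chrom)
    return ordered


def missing_from_reference(fasta_lines, vcf_lines):
    """VCF contig names that have no matching identifier in the FASTA."""
    reference = set(fasta_identifiers(fasta_lines))
    return [name for name in vcf_contigs(vcf_lines) if name not in reference]
```

Both functions accept any iterable of lines, so open file handles can be passed directly, e.g. `missing_from_reference(open("genome.fa"), open("calls.vcf"))` with hypothetical file names. An empty result means every contig name in the VCF also occurs in the FASTA.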

You probably want this type of header to match other data inputs:

>NW_015379183.1

Instead of this:

>gi|996703411|ref|NW_015379183.1|
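One possible way to produce the shorter headers before mapping is sketched below. This is an illustration rather than an official Galaxy or GATK procedure: the function names are hypothetical, the regular expression assumes the legacy `gi|number|db|accession|` layout shown above, and any description text after the identifier is dropped so that only the accession remains.

```python
import re

# Matches legacy NCBI-style identifiers such as gi|996703411|ref|NW_015379183.1|
NCBI_ID = re.compile(r"^gi\|\d+\|[a-z]+\|([^|\s]+)\|?")


def simplify_identifier(name: str) -> str:
    """Return the bare accession for a gi|...|db|ACC| identifier, else the name unchanged."""
    match = NCBI_ID.match(name)
    return match.group(1) if match else name


def simplify_fasta_header(line: str) -> str:
    """Rewrite a FASTA header line; sequence lines pass through untouched."""
    if not line.startswith(">"):
        return line
    parts = line[1:].split(None, 1)  # identifier, then optional description
    if not parts:
        return line
    return ">" + simplify_identifier(parts[0]) + "\n"


def rewrite_fasta(src_path: str, dst_path: str) -> None:
    """Copy a FASTA file, simplifying every header line along the way."""
    with open(src_path) as src, open(dst_path, "w") as dst:
        for line in src:
            dst.write(simplify_fasta_header(line))
```

For example, `rewrite_fasta("genome.fa", "genome.clean.fa")` (hypothetical file names) would write a copy whose headers carry only the accession. The cleaned FASTA then needs to be used for mapping and every later step, so that all downstream files inherit the same identifiers.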

Help:

Thanks, Jen, Galaxy team
