Question: Difficulty with aaChanges
rebeccajstoll (United States) wrote, 3.3 years ago:

Hi!

We have a series of mutations that are referenced to hg19. We would like to know how these mutations change the amino acids encoded by hg19. We would like a result file that contains both the original amino acid and the new amino acid created by each mutation.

We are starting with files that contain thousands of mutations and would like to process these files all at once.

Is there a way to use the "aaChanges" function to accomplish this? We tried, but aaChanges apparently only returns amino acid changes for known SNPs (as opposed to amino acid changes caused by novel mutations).

Thank you!

I am using this as my input. It contains both mutations and SNPs, but my output only has the SNP positions in it.

chr2 179395067 179395068 C/G
chr2 179440029 179440030 G/A
chr2 179485919 179485920 G/A
chr2 179506790 179506791 A/T
chr2 179528807 179528808 T/A
chr2 179535087 179535088 T/C
chr2 179536884 179536885 C/T
chr2 179543705 179543706 G/A
chr2 179554305 179554306 C/T
chr2 179566038 179566039 C/T
chr2 179566051 179566052 C/T
chr2 179611205 179611206 G/T
chr2 179635605 179635606 A/G
chr2 179641951 179641952 G/A
chr2 179643566 179643567 A/G
chr2 179650701 179650702 C/T
chr2 179666830 179666831 G/A
chr2 179667090 179667091 C/T

modified 3.3 years ago • written 3.3 years ago by rebeccajstoll

Do you have the coordinates of the SNP?

written 3.3 years ago by Bjoern Gruening

Hi! Thank you for your quick response!

Yes, I am using this as my input. It contains both mutations and SNPs, but my output only has the SNP positions in it.

chr2 179395067 179395068 C/G
chr2 179440029 179440030 G/A
chr2 179485919 179485920 G/A
chr2 179506790 179506791 A/T
chr2 179528807 179528808 T/A
chr2 179535087 179535088 T/C
chr2 179536884 179536885 C/T
chr2 179543705 179543706 G/A
chr2 179554305 179554306 C/T
chr2 179566038 179566039 C/T
chr2 179566051 179566052 C/T
chr2 179611205 179611206 G/T
chr2 179635605 179635606 A/G
chr2 179641951 179641952 G/A
chr2 179643566 179643567 A/G
chr2 179650701 179650702 C/T
chr2 179666830 179666831 G/A
chr2 179667090 179667091 C/T

written 3.3 years ago by rebeccajstoll

Can you extend your original question with this information? Thanks.

written 3.3 years ago by Bjoern Gruening

Bjoern Gruening (Germany) wrote, 3.3 years ago:

Hi,

OK, my first thought would be to filter them against an annotated gene list. Go to UCSC and download all gene coordinates for hg19. Intersect your list with the list of all known genes. You should now have one list containing only the genes that overlap your mutations. You can convert this list to sequences (Extract Genomic DNA using coordinates from assembled/unassembled genomes). You should end up with a FASTA file of your genes, which can be translated into amino acid sequences.

The only step we still need to figure out is how to alter the gene sequence given a BED file. If this is possible, you can generate a second FASTA file of mutated sequences and compare the two.
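For the "alter the sequence and compare" step, here is a minimal Python sketch of the idea, outside of Galaxy. It is only an illustration under several assumptions: the sequence is an already spliced, forward-strand coding sequence starting at the first codon, and the mutation position has already been converted to a 0-based offset within that sequence. Mapping genomic coordinates to that offset, and reverse-complementing minus-strand genes, is not handled here. The function names (translate, apply_snv, aa_change) and the toy sequence are made up for this example.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), bases ordered T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(cds):
    """Translate a coding sequence codon by codon; a trailing partial codon is ignored."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, usable, 3))


def apply_snv(cds, offset, ref, alt):
    """Return a copy of cds with the base at the 0-based offset changed from ref to alt."""
    if cds[offset].upper() != ref.upper():
        raise ValueError(f"expected {ref} at offset {offset}, found {cds[offset]}")
    return cds[:offset] + alt + cds[offset + 1:]


def aa_change(cds, offset, ref, alt):
    """Return (original amino acid, new amino acid, 1-based codon number)."""
    mutated = apply_snv(cds, offset, ref, alt)
    codon_index = offset // 3
    return translate(cds)[codon_index], translate(mutated)[codon_index], codon_index + 1


# Toy sequence, not taken from hg19: ATG GAA GTG encodes M E V.
toy_cds = "ATGGAAGTG"
print(aa_change(toy_cds, 4, "A", "T"))  # ('E', 'V', 2)
```

The reference-base check in apply_snv is there on purpose: if the base in the sequence does not match the first allele in your file, the coordinates or the strand are probably off, and it is better to stop than to report a wrong amino acid change.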

Crying for help @Jen!!!!

Another solution would be to convert your file to VCF and annotate it with SnpEff. SnpEff will tell you whether your mutation causes a change in the amino acid sequence.
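If you want to try the VCF route, the conversion could look roughly like the Python sketch below. It rests on assumptions you should check against your data: that the start column is 0-based as in BED (so the VCF position is start + 1), and that the allele before the slash is the reference base and the one after it is the alternate. The function names are made up for this example, and the header is the bare minimum; the chromosome names also have to match the ones used by the annotation database you pick.

```python
def bed_line_to_vcf(line):
    """Convert one 'chrom start end REF/ALT' line into a tab-separated VCF record.

    Assumes a 0-based start (BED convention) and REF/ALT allele order.
    """
    chrom, start, _end, alleles = line.split()
    ref, alt = alleles.split("/")
    pos = int(start) + 1  # VCF positions are 1-based
    # Columns: CHROM POS ID REF ALT QUAL FILTER INFO ('.' means missing)
    return "\t".join([chrom, str(pos), ".", ref, alt, ".", ".", "."])


def bed_to_vcf(lines):
    """Build a minimal VCF text from an iterable of input lines, skipping blank ones."""
    header = [
        "##fileformat=VCFv4.2",
        "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    ]
    records = [bed_line_to_vcf(line) for line in lines if line.strip()]
    return "\n".join(header + records) + "\n"


# Two lines taken from the input posted in the question.
example = [
    "chr2 179395067 179395068 C/G",
    "chr2 179440029 179440030 G/A",
]
print(bed_to_vcf(example), end="")
```

If the first allele turns out not to be the hg19 reference base at that position, the coordinate assumption is the first thing to check.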

Does this get you started?

Bjoern

written 3.3 years ago by Bjoern Gruening

Thank you! This definitely gets me started! I will share this information with my more experienced colleagues and let you know where that leaves us!

written 3.3 years ago by rebeccajstoll

Hello again Bjoern!

Your post has been helpful and I am definitely on the right track! I am very new to all of this, so I just wanted to double-check something. When you say intersect the lists, should I just copy the positions I am looking for into my Excel file with all gene coordinates from hg19?

Thank you again!

written 3.3 years ago by rebeccajstoll

Oh no... Galaxy should replace Excel ;) As soon as you find yourself doing tedious, repetitive work, you are doing something wrong or Galaxy is missing a feature :)

Galaxy has several intersect tools that work on coordinates. You simply get the intersection of two coordinate files, that is, the regions that overlap.
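To make "intersection" concrete, here is a small Python sketch of what such a tool does conceptually. It is not how the Galaxy tools are implemented, just the overlap test written out. It assumes half-open intervals (start included, end excluded) on the same chromosome naming scheme; the gene names and gene coordinates below are invented for the example, and the nested loop is only meant for small inputs.

```python
def overlaps(a_start, a_end, b_start, b_end):
    """True if half-open intervals [a_start, a_end) and [b_start, b_end) share a base."""
    return a_start < b_end and b_start < a_end


def intersect(mutations, genes):
    """Pair every mutation with every gene it overlaps on the same chromosome.

    mutations: iterable of (chrom, start, end, label)
    genes:     iterable of (chrom, start, end, name)
    """
    hits = []
    for chrom, start, end, label in mutations:
        for g_chrom, g_start, g_end, g_name in genes:
            if chrom == g_chrom and overlaps(start, end, g_start, g_end):
                hits.append((chrom, start, end, label, g_name))
    return hits


# Mutation lines from the question; gene intervals are hypothetical placeholders.
mutations = [
    ("chr2", 179395067, 179395068, "C/G"),
    ("chr2", 179440029, 179440030, "G/A"),
]
genes = [
    ("chr2", 179390000, 179400000, "geneA"),
    ("chr2", 1000, 2000, "geneB"),
]
for hit in intersect(mutations, genes):
    print(hit)
```

With these placeholder genes only the first mutation is reported, paired with geneA; the second mutation falls outside both intervals and is dropped, which is exactly the filtering effect you want from the intersect step.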

written 3.3 years ago by Bjoern Gruening