Question: Fwd: [Galaxy-Lab] Question About Branch Lengths Estimation
8.2 years ago by
Guruprasad Ananda wrote:
Hi Melissa,

So looks like you'll have to use 'N' as a masking character instead of #. You can either rerun quality masking on your alignments or do the following to convert your #s to Ns in your masked fasta files.

1. Convert fasta to tabular
2. Use 'Text manipulation -> Compute' tool on the tabular file from step (1) to convert #s to Ns, using the following expression: c2.replace(chr(35),"N")
3. Convert output of step (2) to Fasta using 'Convert formats -> Tabular-to-FASTA' tool with c1 as title column and c3 as sequence column.

Thanks,
Guru.

Begin forwarded message:
8.2 years ago by
Thanks for looking into this, Guru :)

The masking tool is supposed to mask all columns of the alignment anywhere one of them has a quality less than score XX. That means that all alignments *should* be the same length, even after # symbols are ignored in HyPhy. Thus, there shouldn't be a problem with using # as a masking symbol rather than N.

I will attempt changing the # characters to N, but wanted to mention that the solutions you sent don't address the possibility that the sequence lengths might be different, as a result of the masking tool. I'll let you know how it goes.

Thanks,
Melissa

--
Melissa A. Wilson Sayres
NSF Graduate Research Fellow, Bioinformatics & Genomics
306 Wartik Lab
University Park, PA 16802

It is far better to grasp the Universe as it really is than to persist in delusion, however satisfying and reassuring. -- Carl Sagan
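
For anyone who wants to try the same conversion outside Galaxy, here is a minimal Python sketch of the two points raised in this thread: replacing # with N in masked FASTA sequences (the same `replace` call as the Compute expression above), and checking whether all sequences in an alignment have the same length. The function names and the two example records are made up for illustration; this is not the code the Galaxy tools run. Note that a one-for-one swap of # for N cannot change a sequence's length, so the length check only reports whether the masked sequences already differed.

```python
def read_fasta(lines):
    """Parse FASTA text lines into a list of (title, sequence) pairs."""
    records = []
    title, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if title is not None:
                records.append((title, "".join(chunks)))
            title, chunks = line[1:], []
        else:
            chunks.append(line)
    if title is not None:
        records.append((title, "".join(chunks)))
    return records


def mask_hash_to_n(records):
    """Replace every '#' masking character with 'N' in each sequence."""
    return [(title, seq.replace("#", "N")) for title, seq in records]


def sequence_lengths(records):
    """Map each title to its sequence length."""
    return {title: len(seq) for title, seq in records}


# Hypothetical two-sequence alignment, masked with '#'.
example = """>species_a
ACGT##AC
>species_b
AC#T##AC
""".splitlines()

records = mask_hash_to_n(read_fasta(example))
lengths = sequence_lengths(records)

for title, seq in records:
    print(">" + title)
    print(seq)
print("all same length:", len(set(lengths.values())) == 1)
```

To run it on a real file, pass an open file handle to `read_fasta` in place of the `example` lines.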
