Question: Megablast
6.8 years ago, Scott Tighe wrote:
Hi Galaxy users,

When megablasting:

1) What does the "identity value -p" mean? Is it percent identity? I want my megablast results to be reported only for a 100% match. I do not see a place for % alignment concordance.

2) For my Illumina HiSeq reads, are the adaptor sequences filtered during the filter step?

Scott Tighe

--
Scott Tighe
Advanced Genome Technology Lab
Vermont Cancer Center at the University of Vermont
6.8 years ago, Jennifer Hillman Jackson wrote:
Hello Scott,

For #1, option "-p": here is a link to some megablast parameter documentation online: http://www.ncbi.nlm.nih.gov/staff/tao/URLAPI/megablast.html#3 (the primary paper for the Galaxy tool is noted at the bottom of the tool form, but this is convenient).

Quote:

Table 3.30
Parameter: -p
Function: Specifies the percentage identity cut-off
Default: 0
Input format: [Real]
Example: To set percent id cutoff to 75%, use: -p 75
Note: The input value range is between 0 and 100, with 0 meaning no cutoff. It only works on the aligned region or individual HSPs.

For #2, there are a few ways to interpret "filter". If you mean "will megablast consider the adapter part of the sequence in its calculations?", the answer is that it does for some calculations and not for others. The part of the sequence that is adapter wouldn't align to the genome, and percent identity is based only on HSPs (high-scoring segment pairs: one member of the pair is the DNA query and the other is the genome target, for that alignment region only). So adapter sequence wouldn't be involved in percent identity calculations (nor would it be expected to be!). But these unaligned regions could become a problem if coverage or certain other statistics were part of your analysis. Learning about the statistics you choose to use, to see whether query length is part of the calculation, will let you know if clipping is necessary.

If it is important, removing adapters can be done with tools in "NGS: QC and manipulation" (perform a tool search on the keywords "trim" or "clip").

Best,
Jen
Galaxy team

--
Jennifer Jackson
http://usegalaxy.org
http://galaxyproject.org/wiki/Support
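To illustrate the difference between per-HSP percent identity and a statistic that depends on query length, here is a minimal Python sketch. It assumes the hits are available as rows in the standard 12-column BLAST tabular layout (query id, subject id, percent identity, alignment length, mismatches, gap opens, query start, query end, subject start, subject end, e-value, bit score); the Galaxy megablast tool's own output columns may differ, so the column positions would need adjusting. The function names and the `query_lengths` dictionary are hypothetical and only serve the example.

```python
from typing import Dict, Iterable, List, NamedTuple


class Hsp(NamedTuple):
    query_id: str
    subject_id: str
    percent_identity: float
    alignment_length: int
    query_start: int
    query_end: int


def parse_tabular(lines: Iterable[str]) -> List[Hsp]:
    """Parse rows in the ASSUMED 12-column BLAST tabular layout."""
    hsps = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        f = line.split("\t")
        hsps.append(Hsp(f[0], f[1], float(f[2]), int(f[3]), int(f[6]), int(f[7])))
    return hsps


def query_coverage(hsp: Hsp, query_length: int) -> float:
    """Fraction of the query spanned by this HSP (1-based, inclusive coordinates)."""
    span = abs(hsp.query_end - hsp.query_start) + 1
    return span / query_length


def full_length_perfect_hits(
    hsps: Iterable[Hsp],
    query_lengths: Dict[str, int],
    min_identity: float = 100.0,
    min_coverage: float = 1.0,
) -> List[Hsp]:
    """Keep HSPs that meet both the identity and the query-coverage thresholds."""
    return [
        h
        for h in hsps
        if h.percent_identity >= min_identity
        and query_coverage(h, query_lengths[h.query_id]) >= min_coverage
    ]


if __name__ == "__main__":
    # Made-up rows: read2 aligns over only 75 of its 100 bases,
    # as it might if the last 25 bases were adapter sequence.
    rows = [
        "read1\tchr1\t100.00\t100\t0\t0\t1\t100\t501\t600\t1e-50\t185",
        "read2\tchr1\t100.00\t75\t0\t0\t1\t75\t901\t975\t1e-35\t139",
    ]
    lengths = {"read1": 100, "read2": 100}  # hypothetical read lengths
    for hit in full_length_perfect_hits(parse_tabular(rows), lengths):
        print(hit.query_id, hit.subject_id, hit.percent_identity)
```

In this made-up data both reads report 100% identity, because identity is computed over the aligned region only. Only the coverage check, which uses the query length, separates the fully aligned read from the one with an unaligned tail, which is the situation where clipping adapters first would matter.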