I used TopHat to map RNA-Seq reads to genomes. In the output (.sam)
file, the mapping quality (the 5th column) of some reads is 255. What
does it mean? I also found that the reads with mapping quality 255
map to a unique place.
This is not strictly correct. Tophat/bowtie don't report mapping
quality values that are as meaningful as BWA's, but there is some
information in the mapping quality values tophat reports. Tophat yields
4 distinct values for its mapping quality (you can do a "unique" count
on the mapping quality field of any SAM file from tophat to verify this):
255 = unique mapping
3 = maps to 2 locations in the target
2 = maps to 3 locations
1 = maps to 4-9 locations
0 = maps to 10 or more locations.
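
The "unique" count mentioned above could be done along these lines. This is only a minimal sketch: the function name mapq_counts and the file name in the usage note are hypothetical, and it assumes an uncompressed, tab-separated SAM file whose header lines start with "@".

```python
from collections import Counter

def mapq_counts(sam_lines):
    """Count how often each mapping quality (5th column) occurs."""
    counts = Counter()
    for line in sam_lines:
        if line.startswith("@"):
            continue  # skip SAM header lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 5:
            continue  # skip blank or truncated lines
        counts[int(fields[4])] += 1  # MAPQ is the 5th tab-separated field
    return counts
```

For example, with a hypothetical file name: `with open("tophat_output.sam") as fh: print(mapq_counts(fh))`. The keys of the result are the distinct mapping quality values present in the file.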
Except for the 255 case, the simple rule that was encoded is
the usual phred quality scale:
MapQ = -10 log10(P)
where P = probability that this mapping is NOT the correct one. They
ignore the number of mismatches in this calculation and simply assume
that if a read maps to 2 locations then P = 0.5, 3 locations implies
P = 2/3, 4 locations => P = 3/4, etc.
As you can see, MapQ = -10 log10(0.5) = 3.01 (rounds to 3);
-10 log10(2/3) = 1.76 (rounds to 2);
-10 log10(3/4) = 1.25 (rounds to 1), etc.
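
The arithmetic above can be checked with a short sketch. It reproduces only the rule as described in this message, not TopHat's actual source code; the function name tophat_style_mapq is hypothetical, and the 255 case is hard-coded because it does not follow the formula.

```python
import math

def tophat_style_mapq(n_locations):
    """Mapping quality for a read that maps to n_locations places."""
    if n_locations < 1:
        raise ValueError("a mapped read has at least one location")
    if n_locations == 1:
        return 255  # unique mapping: special value, not from the formula
    # P = probability that this particular mapping is NOT the correct one,
    # assuming all locations are equally likely (mismatches are ignored).
    p_wrong = (n_locations - 1) / n_locations
    return round(-10 * math.log10(p_wrong))

for n in (1, 2, 3, 4, 9, 10):
    print(n, tophat_style_mapq(n))
```

Under these assumptions the loop gives 255, 3, 2, 1, 1 and 0 for 1, 2, 3, 4, 9 and 10 locations, which matches the table of values listed earlier.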
Date: Tue, 7 Feb 2012 17:56:34 -0500
To: "Li, Jilong (MU-Student)" <email@example.com>
Cc: "firstname.lastname@example.org" <email@example.com>
Subject: Re: [galaxy-user] about Mapping Quality
Tophat/Bowtie does not yield mapping quality, so, as per the SAM spec,
the field is set to 255, indicating that quality is unavailable.