DocumentCode
910000
Title
On the mean accuracy of statistical pattern recognizers
Author
Hughes, Gordon F.
Volume
14
Issue
1
fYear
1968
fDate
1/1/1968 12:00:00 AM
Firstpage
55
Lastpage
63
Abstract
The overall mean recognition probability (mean accuracy) of a pattern classifier is calculated and numerically plotted as a function of the pattern measurement complexity n and design data set size m. Utilized is the well-known probabilistic model of a two-class, discrete-measurement pattern environment (no Gaussian or statistical independence assumptions are made). The minimum-error recognition rule (Bayes) is used, with the unknown pattern environment probabilities estimated from the data relative frequencies. In calculating the mean accuracy over all such environments, only three parameters remain in the final equation: n, m, and the prior probability p_c of either of the pattern classes. With a fixed design pattern sample, recognition accuracy can first increase as the number of measurements made on a pattern increases, but decay with measurement complexity higher than some optimum value. Graphs of the mean accuracy exhibit both an optimal and a maximum acceptable value of n for fixed m and p_c. A four-place tabulation of the optimum n and maximum mean accuracy values is given for equally likely classes over a range of m. The penalty exacted for the generality of the analysis is the use of the mean accuracy itself as a recognizer optimality criterion. Namely, one necessarily always has some particular recognition problem at hand whose Bayes accuracy will be higher or lower than the mean over all recognition problems having fixed n, m, and p_c.
Keywords
Pattern classification; Accuracy; Equations; Information theory; Pattern recognition; Probability; Size measurement;
fLanguage
English
Journal_Title
Information Theory, IEEE Transactions on
Publisher
IEEE
ISSN
0018-9448
Type
jour
DOI
10.1109/TIT.1968.1054102
Filename
1054102