Title :
The effect of finite sample size on the holdout error probability estimator of homoscedastic multi-class Gaussian classification problems
Author :
El Ayadi, Moataz ; Plataniotis, Konstantinos N.
Author_Institution :
Edward S. Rogers Sr. Dept. of Electr. & Comput. Eng., Univ. of Toronto, Toronto, ON, Canada
Abstract :
Consider a homoscedastic multi-class Gaussian classification problem where the class mean vectors and the common covariance matrix are not known to the practitioner. Rather, they are estimated from the sample vectors available for each class. In this paper, an empirical procedure for approximating the bias of the holdout estimator of the Bayesian error probability (BEP) is presented. Synthetic experiments demonstrate the accuracy of the proposed procedure and show how it can guide the practitioner on the number of data vectors required to achieve a certain level of accuracy in the BEP estimation. When applied to real-world classification problems from the UCI machine learning repository, the proposed procedure was successfully used to estimate the test error probability based on the training data only. Moreover, with a reasonable degree of accuracy, the proposed procedure predicted the test BEP when the amount of training data is increased.
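The finite-sample effect described in the abstract can be illustrated with a small simulation. The sketch below is not the procedure proposed in the paper; it only assumes equal class priors, a plug-in linear discriminant built from estimated means and a pooled covariance, and a synthetic two-class problem chosen for illustration. The function name `holdout_error` and all parameter values are hypothetical.

```python
import numpy as np

def holdout_error(means, cov, n_train, n_test, rng):
    """Fit a plug-in linear discriminant on n_train samples per class and
    return its error rate on n_test fresh samples per class."""
    k = means.shape[0]
    # Draw training data, then estimate class means and the pooled covariance.
    train = [rng.multivariate_normal(m, cov, n_train) for m in means]
    mu_hat = np.array([x.mean(axis=0) for x in train])
    centered = np.vstack([x - mu for x, mu in zip(train, mu_hat)])
    cov_hat = centered.T @ centered / (k * n_train - k)
    prec = np.linalg.inv(cov_hat)
    # Linear discriminant g_i(x) = w_i . x + b_i (equal priors assumed).
    w = mu_hat @ prec
    b = -0.5 * np.sum(w * mu_hat, axis=1)
    errors = 0
    for label, m in enumerate(means):
        test = rng.multivariate_normal(m, cov, n_test)
        pred = np.argmax(test @ w.T + b, axis=1)
        errors += int(np.sum(pred != label))
    return errors / (k * n_test)

rng = np.random.default_rng(0)
d = 5
# Illustrative problem: two classes, identity covariance, Mahalanobis distance 2.
means = np.zeros((2, d))
means[1, 0] = 2.0
cov = np.eye(d)

# Average holdout error with few training vectors versus many.
small_avg = np.mean([holdout_error(means, cov, 8, 2000, rng) for _ in range(100)])
large_err = holdout_error(means, cov, 2000, 20000, rng)
print(f"avg holdout error, 8 training vectors per class:    {small_avg:.3f}")
print(f"holdout error, 2000 training vectors per class:     {large_err:.3f}")
```

For this particular setup the Bayes error is Φ(-1), roughly 0.159, so the gap between the small-sample average and that value gives a rough picture of the kind of bias the paper sets out to approximate.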
Keywords :
Bayes methods; Gaussian processes; error statistics; learning (artificial intelligence); pattern classification; BEP estimation; Bayesian error probability; UCI machine learning repository; covariance matrix; error probability; finite sample size; holdout error probability estimator; homoscedastic multiclass Gaussian classification problems; training data; Accuracy; Covariance matrix; Error probability; Estimation; Mathematical model; Nickel; Training data;
Conference_Title :
Neural Networks (IJCNN), The 2010 International Joint Conference on
Conference_Location :
Barcelona
Print_ISBN :
978-1-4244-6916-1
DOI :
10.1109/IJCNN.2010.5596888