DocumentCode :
2040485
Title :
Classification performance of various real-life data sets when the features are discretized
Author :
Lynch, Robert S., Jr. ; Willett, Peter K.
Author_Institution :
Signal Process. Branch, Naval Undersea Warfare Center, Newport, RI, USA
Volume :
2
fYear :
2001
fDate :
2001
Firstpage :
753
Abstract :
The Bayesian data reduction algorithm is applied to a collection of thirty real-life data sets primarily found at the University of California at Irvine's Repository of Machine Learning databases. The algorithm works by finding the best performing quantization complexity of the feature vectors, and this makes it necessary to discretize all continuous valued features. Therefore, results are given by showing the initial quantization of the continuous valued features that yields best performance. Further, the Bayesian data reduction algorithm is also compared to a conventional linear classifier, which does not discretize any feature values. In general, the Bayesian data reduction algorithm outperforms the linear classifier by obtaining a lower probability of error, as averaged over all thirty data sets.
Keywords :
Bayes methods; data reduction; error statistics; feature extraction; learning (artificial intelligence); pattern classification; vector quantisation; Bayesian data reduction algorithm; Repository of Machine Learning databases; classification performance; continuous valued features; discretization; feature vectors; linear classifier; probability error; quantization complexity; real-life data sets; Bayesian methods; Error probability; Machine learning; Machine learning algorithms; Military computing; Quantization; Signal processing algorithms; Spatial databases; Testing; Vectors;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Systems, Man, and Cybernetics, 2001 IEEE International Conference on
Conference_Location :
Tucson, AZ
ISSN :
1062-922X
Print_ISBN :
0-7803-7087-2
Type :
conf
DOI :
10.1109/ICSMC.2001.973005
Filename :
973005