DocumentCode
2040485
Title
Classification performance of various real-life data sets when the features are discretized
Author
Lynch, Robert S., Jr. ; Willett, Peter K.
Author_Institution
Signal Process. Branch, Naval Undersea Warfare Center, Newport, RI, USA
Volume
2
fYear
2001
fDate
2001
Firstpage
753
Abstract
The Bayesian data reduction algorithm is applied to a collection of thirty real-life data sets primarily found at the University of California at Irvine's Repository of Machine Learning databases. The algorithm works by finding the best performing quantization complexity of the feature vectors, and this makes it necessary to discretize all continuous valued features. Therefore, results are given by showing the initial quantization of the continuous valued features that yields best performance. Further, the Bayesian data reduction algorithm is also compared to a conventional linear classifier, which does not discretize any feature values. In general, the Bayesian data reduction algorithm outperforms the linear classifier by obtaining a lower probability of error, as averaged over all thirty data sets.
Keywords
Bayes methods; data reduction; error statistics; feature extraction; learning (artificial intelligence); pattern classification; vector quantisation; Bayesian data reduction algorithm; Repository of Machine Learning databases; classification performance; continuous valued features; discretization; feature vectors; linear classifier; probability error; quantization complexity; real-life data sets; Bayesian methods; Error probability; Machine learning; Machine learning algorithms; Military computing; Quantization; Signal processing algorithms; Spatial databases; Testing; Vectors
fLanguage
English
Publisher
IEEE
Conference_Titel
Systems, Man, and Cybernetics, 2001 IEEE International Conference on
Conference_Location
Tucson, AZ
ISSN
1062-922X
Print_ISBN
0-7803-7087-2
Type
conf
DOI
10.1109/ICSMC.2001.973005
Filename
973005