Title :
Verification of supervised clustering validity and its applications
Author :
Zhan, Yan ; Yuan, Fang ; Wang, Xi Zhao
Author_Institution :
Sch. of Math. & Comput. Sci., Hebei Univ., China
Abstract :
The clustering validity problem asks whether the clustering result of a data set is reasonable. The validity functions commonly used for clustering are designed for unsupervised clustering, where their main purpose is to determine the number of classes. This paper proposes a validity function for judging clustering; numerical experiments demonstrate its validity. As an application to k-nearest neighbor classification, if a data set satisfies the condition of this function, each class center can be taken as the representative of its class. Classification then no longer requires iterating over all the training data; it only needs to compare the similarity between each test sample and each class center. When selecting the k neighbors, it suffices to choose only the nearest neighbor (1-nearest neighbor), which achieves precise classification and avoids the trouble of choosing the value of k. This greatly reduces query complexity and improves the efficiency of the nearest neighbor algorithm.
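The class-center idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the validity function itself is not given in this record and is not reproduced here, the helper names `class_centers` and `classify` are hypothetical, and Euclidean distance is assumed as the similarity measure. The sketch only shows how a query is compared against one center per class instead of every training sample.

```python
import math
from collections import defaultdict


def class_centers(samples, labels):
    """Return the mean vector of each class (hypothetical helper)."""
    sums = {}
    counts = defaultdict(int)
    for x, y in zip(samples, labels):
        if y not in sums:
            sums[y] = [0.0] * len(x)
        for i, value in enumerate(x):
            sums[y][i] += value
        counts[y] += 1
    return {y: [s / counts[y] for s in sums[y]] for y in sums}


def classify(x, centers):
    """1-nearest neighbor over class centers.

    Assumption: similarity is measured by Euclidean distance.
    """
    return min(centers, key=lambda y: math.dist(x, centers[y]))


# Toy training data, invented for illustration only.
train_x = [(0.0, 0.0), (0.0, 2.0), (10.0, 10.0), (12.0, 10.0)]
train_y = ["a", "a", "b", "b"]

centers = class_centers(train_x, train_y)
print(classify((1.0, 1.0), centers))
print(classify((9.0, 9.0), centers))
```

With c classes and n training samples, each query in this sketch costs c distance computations rather than n, which is the reduction in query complexity the abstract refers to.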
Keywords :
learning (artificial intelligence); pattern classification; pattern clustering; class center; clustering validity; data set; inter similarity; iteration query; machine learning; nearest neighbor classification; pattern recognition; unsupervised clustering; validity functions; Clustering algorithms; Entropy; Information science; Machine learning; Mathematics; Nearest neighbor searches; Partitioning algorithms; Qualifications; Testing; Training data;
Conference_Title :
Machine Learning and Cybernetics, 2002. Proceedings. 2002 International Conference on
Print_ISBN :
0-7803-7508-4
DOI :
10.1109/ICMLC.2002.1175357