DocumentCode :
1828580
Title :
Pairwise Clustering by Minimizing the Error of Unsupervised Nearest Neighbor Classification
Author :
Yingzhen Yang ; Xinqi Chu ; Thomas S. Huang
Author_Institution :
Dept. of Electr. & Comput. Eng., Univ. of Illinois at Urbana-Champaign, Urbana, IL, USA
Volume :
2
fYear :
2013
fDate :
4-7 Dec. 2013
Firstpage :
182
Lastpage :
187
Abstract :
Pairwise clustering methods, including popular graph-cut-based approaches such as normalized cut, partition the data space into clusters using the pairwise affinity between data points. The success of pairwise clustering largely depends on the pairwise affinity function defined over data points coming from different clusters. Interpreting the pairwise affinity in a probabilistic framework, we establish the relationship between pairwise clustering and unsupervised classification by learning a soft Nearest Neighbor (NN) classifier from unlabeled data, and we search for the optimal partition of the data points by minimizing the generalization error of the learned classifier associated with each data partition. When the underlying distribution of the data is modeled by non-parametric kernel density estimation, the asymptotic generalization error of the unsupervised soft NN classification involves only the pairwise affinity between data points. Moreover, this error rate reduces to the well-known kernel form of graph cut in the case of a uniform data distribution, which provides another understanding of the kernel similarity used in Laplacian Eigenmaps [1], a method that also assumes a uniform distribution. By minimizing the generalization error bound, we propose a new clustering algorithm. Our algorithm efficiently partitions the data by inference in a pairwise MRF model. Experimental results demonstrate the effectiveness of our method.
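The kernel form of graph cut mentioned in the abstract can be illustrated with a minimal sketch. It assumes a Gaussian kernel as the pairwise affinity and simply scores a candidate labelling by summing affinities across clusters. The function names, the `bandwidth` parameter, and the toy data are illustrative assumptions; this is not the paper's error bound or its MRF inference procedure.

```python
import numpy as np

def gaussian_affinity(X, bandwidth):
    """Pairwise Gaussian affinities K[i, j] = exp(-||x_i - x_j||^2 / (2 h^2)).

    The Gaussian kernel and the bandwidth h are assumptions made for this sketch.
    """
    sq_norms = np.sum(X ** 2, axis=1)
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * (X @ X.T)
    sq_dists = np.maximum(sq_dists, 0.0)  # clip tiny negatives from round-off
    return np.exp(-sq_dists / (2.0 * bandwidth ** 2))

def kernel_cut(K, labels):
    """Sum of affinities over unordered pairs of points with different labels."""
    labels = np.asarray(labels)
    different = labels[:, None] != labels[None, :]
    # Each unordered pair appears twice in the symmetric mask, hence the 0.5.
    return 0.5 * float(np.sum(K[different]))

# Toy data (hypothetical): two well-separated groups of two points each.
X = np.array([[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0]])
K = gaussian_affinity(X, bandwidth=1.0)

good_cut = kernel_cut(K, [0, 0, 1, 1])  # labelling that follows the groups
bad_cut = kernel_cut(K, [0, 1, 0, 1])   # labelling that splits each group
```

In this toy setting the labelling that follows the two groups yields the smaller cut value, which is the behaviour a cut-minimizing partition relies on.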
Keywords :
generalisation (artificial intelligence); graph theory; inference mechanisms; minimisation; nonparametric statistics; pattern classification; pattern clustering; statistical distributions; unsupervised learning; Laplacian eigenmaps; asymptotic generalization error minimization; data points partition; data space partition; error rate; graph cut based approach; inference mechanism; kernel similarity; learning; nonparametric kernel density estimation; pairwise MRF model; pairwise affinity function; pairwise clustering method; uniform data distribution modeling; unsupervised nearest neighbor classification; unsupervised soft NN classification; Bandwidth; Clustering algorithms; Clustering methods; Data models; Kernel; Labeling; Training; Kernel Density Estimation; Nearest Neighbor Classifier; Pairwise Clustering;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Machine Learning and Applications (ICMLA), 2013 12th International Conference on
Conference_Location :
Miami, FL
Type :
conf
DOI :
10.1109/ICMLA.2013.188
Filename :
6786105