DocumentCode :
838413
Title :
Optimal cluster preserving embedding of nonmetric proximity data
Author :
Roth, Volker ; Laub, Julian ; Kawanabe, Motoaki ; Buhmann, Joachim M.
Author_Institution :
Dept. of Comput. Sci., Bonn Univ., Germany
Volume :
25
Issue :
12
fYear :
2003
Firstpage :
1540
Lastpage :
1551
Abstract :
For several major applications of data analysis, objects are often not represented as feature vectors in a vector space, but rather by a matrix gathering pairwise proximities. Such pairwise data often violates metricity and, therefore, cannot be naturally embedded in a vector space. Concerning the problem of unsupervised structure detection or clustering, in this paper, a new embedding method for pairwise data into Euclidean vector spaces is introduced. We show that all clustering methods that are invariant under additive shifts of the pairwise proximities can be reformulated as grouping problems in Euclidean spaces. The most prominent property of this constant shift embedding framework is the complete preservation of the cluster structure in the embedding space. Restating pairwise clustering problems in vector spaces has several important consequences, such as the statistical description of the clusters by way of cluster prototypes, the generic extension of the grouping procedure to a discriminative prediction rule, and the applicability of standard preprocessing methods like denoising or dimensionality reduction.
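The constant shift idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' code: it assumes a symmetric, zero-diagonal dissimilarity matrix D, adds the smallest constant d0 to all off-diagonal entries that makes the centered similarity matrix positive semidefinite, and then recovers coordinates by eigendecomposition. The function names constant_shift_embedding and squared_distances, and the toy matrix, are hypothetical and chosen only for illustration.

```python
import numpy as np

def constant_shift_embedding(D):
    """Sketch: embed a symmetric, zero-diagonal dissimilarity matrix D.

    Returns coordinates X, the off-diagonal shift d0, and the shifted matrix.
    """
    D = np.asarray(D, dtype=float)
    n = D.shape[0]
    Q = np.eye(n) - np.ones((n, n)) / n           # centering projection
    S_c = -0.5 * Q @ D @ Q                        # centered similarity matrix
    S_c = 0.5 * (S_c + S_c.T)                     # guard against round-off asymmetry
    lam_min = np.linalg.eigvalsh(S_c)[0]          # eigenvalues come in ascending order
    d0 = max(0.0, -2.0 * lam_min)                 # smallest shift giving a PSD matrix
    D_shift = D + d0 * (np.ones((n, n)) - np.eye(n))
    S_shift = -0.5 * Q @ D_shift @ Q
    S_shift = 0.5 * (S_shift + S_shift.T)
    w, V = np.linalg.eigh(S_shift)
    w = np.clip(w, 0.0, None)                     # remove tiny negative round-off
    X = V * np.sqrt(w)                            # one row of X per object
    return X, d0, D_shift

def squared_distances(X):
    """Pairwise squared Euclidean distances between the rows of X."""
    G = X @ X.T
    g = np.diag(G)
    return g[:, None] + g[None, :] - 2.0 * G

# Toy nonmetric input (assumed values): sqrt(9) > sqrt(1) + sqrt(1),
# so no Euclidean configuration reproduces D without a shift.
D = np.array([[0.0, 1.0, 9.0],
              [1.0, 0.0, 1.0],
              [9.0, 1.0, 0.0]])
X, d0, D_shift = constant_shift_embedding(D)
```

In this sketch the squared Euclidean distances between the rows of X reproduce the shifted dissimilarities, so any grouping criterion that is unaffected by the additive off-diagonal shift sees the same cluster structure in the vector space.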
Keywords :
data analysis; optimal systems; optimisation; pattern clustering; Euclidean vector spaces; cluster prototypes; cluster structure; clustering methods; data analysis; denoising; dimensionality reduction; discriminative prediction rule; embedding method; embedding space; feature vectors; generic extension; grouping problems; grouping procedure; matrix gathering pairwise proximities; nonmetric proximity data; optimal cluster; pairwise clustering problems; pairwise data; pairwise proximities; prominent property; standard preprocessing methods; statistical description; unsupervised structure detection; Additives; Clustering algorithms; Clustering methods; Cost function; Data analysis; Data mining; Extraterrestrial measurements; Genomics; Noise reduction; Prototypes;
fLanguage :
English
Journal_Title :
Pattern Analysis and Machine Intelligence, IEEE Transactions on
Publisher :
IEEE
ISSN :
0162-8828
Type :
jour
DOI :
10.1109/TPAMI.2003.1251147
Filename :
1251147