Data clustering using evidence accumulation

Author

Fred, Ana L N ; Jain, Anil K.

Author_Institution

Telecommun. Inst., Instituto Superior Tecnico, Lisbon, Portugal

Volume

4

fYear

2002

fDate

2002

Firstpage

276

Abstract

We explore the idea of evidence accumulation for combining the results of multiple clusterings. First, a set of n d-dimensional patterns is decomposed into a large number of compact clusters using the K-means algorithm; several such clusterings are obtained from N random initializations of K-means. Taking the co-occurrence of a pair of patterns in the same cluster as a vote for their association, the data partitions are mapped into a co-association matrix of patterns. This n×n matrix represents a new similarity measure between patterns. The final clusters are obtained by applying an MST-based (minimum spanning tree) clustering algorithm to this matrix. Results on both synthetic and real data show the ability of the method to identify arbitrarily shaped clusters in multidimensional data.
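
The pipeline described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the helper names (`kmeans_labels`, `co_association`, `cut_weak_links`), the plain Lloyd-style K-means, the toy two-blob data, and the 0.5 vote threshold are all assumptions made for illustration. The MST step is approximated by taking connected components of the graph that links pairs whose co-association exceeds the threshold, which yields the same groups as building a minimum spanning tree on the dissimilarity 1 - C and cutting the edges whose similarity does not exceed the threshold.

```python
import numpy as np

def kmeans_labels(X, k, rng, n_iter=20):
    """Plain Lloyd-style K-means from one random initialization (illustrative)."""
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        # Squared distance from every pattern to every center, shape (n, k).
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        for j in range(k):
            members = X[labels == j]
            if len(members):  # keep the old center if a cluster is empty
                centers[j] = members.mean(axis=0)
    return labels

def co_association(X, k, n_runs, seed=0):
    """Fraction of the n_runs K-means partitions in which each pair co-occurs."""
    rng = np.random.default_rng(seed)
    n = len(X)
    C = np.zeros((n, n))
    for _ in range(n_runs):
        labels = kmeans_labels(X, k, rng)
        # Each run casts one vote for every pair sharing a cluster.
        C += labels[:, None] == labels[None, :]
    return C / n_runs

def cut_weak_links(C, threshold=0.5):
    """Group patterns connected by co-association above `threshold`.

    The threshold value is an assumption, not taken from the abstract.
    """
    n = len(C)
    labels = -np.ones(n, dtype=int)
    current = 0
    for start in range(n):
        if labels[start] >= 0:
            continue
        labels[start] = current
        stack = [start]
        while stack:  # depth-first search over strong links
            i = stack.pop()
            for j in np.flatnonzero((C[i] > threshold) & (labels < 0)):
                labels[j] = current
                stack.append(j)
        current += 1
    return labels

# Toy data (assumed): two well-separated 2-D blobs of 30 patterns each.
data_rng = np.random.default_rng(0)
X = np.vstack([data_rng.normal(0.0, 1.0, (30, 2)),
               data_rng.normal(10.0, 1.0, (30, 2))])

C = co_association(X, k=4, n_runs=30)   # n x n similarity from accumulated votes
labels = cut_weak_links(C)              # final partition
```

Here `k` plays the role of the number of compact clusters per K-means run and `n_runs` the role of N in the abstract; both values are arbitrary choices for the toy data.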

Keywords

matrix algebra; pattern clustering; K-means algorithm; MST-based clustering algorithm; co-association matrix; compact clusters; data clustering; data partitions; evidence accumulation; multidimensional data; random initializations; similarity measure; Bagging; Boosting; Clustering algorithms; Computer science; Matrix decomposition; Multidimensional systems; Partitioning algorithms; Shape measurement; Unsupervised learning; Voting

fLanguage

English

Publisher

IEEE

Conference_Title

Pattern Recognition, 2002. Proceedings. 16th International Conference on

ISSN

1051-4651

Print_ISBN

0-7695-1695-X

Type

conf

DOI

10.1109/ICPR.2002.1047450

Filename

1047450