Kernel-Based Clustering with Automatic Cluster Number Selection

Author

Wang, Chang-Dong ; Lai, Jian-Huang ; Huang, Dong

Author_Institution

Sch. of Inf. Sci. & Technol., Sun Yat-sen Univ., Guangzhou, China

fYear

2011

fDate

11-11 Dec. 2011

Firstpage

293

Lastpage

299

Abstract

Kernel k-means is one of the best-known kernel-based clustering methods for discovering nonlinearly separable clusters. However, like its original counterpart, k-means, kernel k-means has two inherent drawbacks: (1) it is easily trapped in degenerate local minima when the cluster prototypes are poorly initialized, and (2) the actual number of clusters must be provided in advance. Although some algorithms have been proposed to address the first problem, methods for automatically estimating the number of clusters in kernel space are still lacking. In this paper, inspired by the on-line learning framework and the rival penalization mechanism, we propose a novel kernel-based clustering method with automatic cluster number selection (KeCans for short). In KeCans, prototypes are represented by a prototype descriptor, a real-valued matrix in which each row represents one prototype. At initialization, the prototype descriptor is allocated more rows than the actual number of clusters. Rival penalization is then used during the competition process to eliminate the redundant rows. Experimental results demonstrate the effectiveness of the proposed method in revealing the true number of clusters in kernel space. Compared with state-of-the-art kernel-based clustering algorithms, the proposed method achieves comparable clustering results.
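
For context, the baseline that the abstract criticizes can be sketched as follows. This is a minimal, illustrative batch kernel k-means in plain Python, not the KeCans algorithm. The function names `rbf_kernel` and `kernel_kmeans`, the choice of a Gaussian kernel, and the value `gamma = 0.1` are assumptions made for this example only. The sketch uses the standard kernel-trick identity for the squared distance between a mapped point and a cluster mean in feature space, ||phi(x) - mu_c||^2 = K(x, x) - (2/|C|) sum_j K(x, x_j) + (1/|C|^2) sum_j sum_l K(x_j, x_l), and it makes both drawbacks visible: the caller has to supply the number of clusters `k` and an initial assignment.

```python
import math


def rbf_kernel(x, y, gamma=0.1):
    """Gaussian (RBF) kernel. gamma=0.1 is an illustrative value, not from the paper."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))


def kernel_kmeans(points, init_labels, k, kernel=rbf_kernel, max_iter=100):
    """Batch kernel k-means (hypothetical helper for illustration).

    points      -- list of equal-length numeric tuples
    init_labels -- initial cluster index (0..k-1) for each point
    k           -- number of clusters, which must be known in advance
    """
    n = len(points)
    # Precompute the full kernel (Gram) matrix.
    gram = [[kernel(points[i], points[j]) for j in range(n)] for i in range(n)]
    labels = list(init_labels)

    for _ in range(max_iter):
        members = [[i for i in range(n) if labels[i] == c] for c in range(k)]
        # Third term of the distance: mean pairwise kernel value inside each cluster.
        within = [
            sum(gram[a][b] for a in m for b in m) / (len(m) ** 2) if m else 0.0
            for m in members
        ]
        new_labels = []
        for i in range(n):
            best_c, best_d = labels[i], float("inf")
            for c, m in enumerate(members):
                if not m:
                    continue  # an empty cluster cannot attract points
                cross = sum(gram[i][j] for j in m) / len(m)
                dist = gram[i][i] - 2.0 * cross + within[c]
                if dist < best_d:
                    best_c, best_d = c, dist
            new_labels.append(best_c)
        if new_labels == labels:
            break  # assignments are stable: a (possibly local) minimum
        labels = new_labels
    return labels


if __name__ == "__main__":
    # Two well-separated groups, with one point deliberately mislabeled at the start.
    data = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
            (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
    print(kernel_kmeans(data, [0, 0, 1, 1, 1, 1], k=2))
```

With these toy inputs the mislabeled point is reassigned to its own group. The result depends on the kernel width and on the initial labels, which is the sensitivity to initialization that the abstract describes.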

Keywords

learning (artificial intelligence); matrix algebra; pattern clustering; KeCans; automatic cluster number selection; kernel k-means; kernel space; kernel-based clustering methods; online learning framework; prototype descriptor; real-valued matrix; rival penalization; Arrays; Clustering algorithms; Clustering methods; Convergence; Indexes; Kernel; Prototypes; cluster number selection; data clustering; kernel-based clustering; on-line learning

fLanguage

English

Publisher

IEEE

Conference_Title

Data Mining Workshops (ICDMW), 2011 IEEE 11th International Conference on

Conference_Location

Vancouver, BC

Print_ISBN

978-1-4673-0005-6

Type

conf

DOI

10.1109/ICDMW.2011.107

Filename

6137393