Title :
Initialization of k-modes clustering for categorical data
Author :
Li Tao-Ying ; Chen Yan ; Jin Zhi-hong ; Li Ye
Author_Institution :
Transp. Manage. Coll., Dalian Maritime Univ., Dalian, China
Abstract :
The k-modes clustering algorithm is undoubtedly one of the most widely used partitional algorithms for categorical data. Unfortunately, due to its gradient descent nature, this algorithm is highly sensitive to the initialization of clustering. Categorical initialization methods have been proposed to address this problem. In this paper, we present an overview of clustering initialization methods for numerical data and for categorical data, with an emphasis on their computational efficiency. We then propose a new initialization method for categorical data, which obtains good initial cluster centers using a new distance based on the RD and explores density and grid methods. Finally, the proposed method is tested on the diagnosis dataset, a real-world data set from the UCI Machine Learning Repository, and the experimental results are analyzed; they illustrate that the proposed method is effective and efficient for initializing the clustering of categorical data.
Keywords :
gradient methods; pattern clustering; categorical data clustering; categorical initialization methods; cluster centers; computational efficiency; gradient descent nature; k-modes clustering algorithm; numerical data clustering; partitional algorithms; Algorithm design and analysis; Classification algorithms; Clustering algorithms; Computational efficiency; Computational modeling; Pain; Partitioning algorithms; categorical data; density and grid measure; initialization of clustering; the k-modes clustering;
Conference_Title :
Management Science and Engineering (ICMSE), 2013 International Conference on
Conference_Location :
Harbin
Print_ISBN :
978-1-4799-0473-0
DOI :
10.1109/ICMSE.2013.6586269