• DocumentCode
    2962028
  • Title
    Initialization of k-modes clustering for categorical data
  • Author
    Li Tao-Ying ; Chen Yan ; Jin Zhi-hong ; Li Ye
  • Author_Institution
    Transp. Manage. Coll., Dalian Maritime Univ., Dalian, China
  • fYear
    2013
  • fDate
    17-19 July 2013
  • Firstpage
    107
  • Lastpage
    112
  • Abstract
    The k-modes clustering algorithm is one of the most widely used partitional algorithms for categorical data. Unfortunately, due to its gradient descent nature, the algorithm is highly sensitive to the initialization of clustering. Initialization methods for categorical data have been proposed to address this problem. In this paper, we present an overview of clustering initialization methods for numerical data and for categorical data, with an emphasis on their computational efficiency. We then propose a new initialization method for categorical data, which obtains good initial cluster centers using a new distance based on the RD, and explore density-based and grid-based methods. Finally, the proposed method is tested on the diagnosis dataset, a real-world data set from the UCI Machine Learning Repository, and the experimental results are analyzed; they illustrate that the proposed method is effective and efficient for initializing the clustering of categorical data.
  • Keywords
    gradient methods; pattern clustering; categorical data clustering; categorical initialization methods; cluster centers; computational efficiency; gradient descent nature; k-modes clustering algorithm; numerical data clustering; partitional algorithms; Algorithm design and analysis; Classification algorithms; Clustering algorithms; Computational efficiency; Computational modeling; Pain; Partitioning algorithms; categorical data; density and grid measure; initialization of clustering; the k-modes clustering;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Management Science and Engineering (ICMSE), 2013 International Conference on
  • Conference_Location
    Harbin
  • ISSN
    2155-1847
  • Print_ISBN
    978-1-4799-0473-0
  • Type
    conf
  • DOI
    10.1109/ICMSE.2013.6586269
  • Filename
    6586269