• DocumentCode
    477656
  • Title
    A Weight Entropy k-Means Algorithm for Clustering Dataset with Mixed Numeric and Categorical Data
  • Author
    Li, Taoying; Chen, Yan
  • Author_Institution
    Sch. of Econ. & Manage., Dalian Maritime Univ., Dalian
  • Volume
    1
  • fYear
    2008
  • fDate
    18-20 Oct. 2008
  • Firstpage
    36
  • Lastpage
    41
  • Abstract
    The traditional k-means algorithm makes the distances between objects in the same cluster as small as possible, but it does not adequately separate objects that belong to different clusters, and a dataset with mixed numeric and categorical data is usually not classified correctly. This paper proposes the IWEKM (improved weight entropy k-means) algorithm to overcome these problems. The algorithm modifies the cost function of the entropy weighting k-means clustering algorithm by adding two terms: one that is linearly related to the sum of squared distances between the mean of all objects and the means of all clusters, and one that is related to the relativity degree of the categorical data. Results of different clustering algorithms applied to the Iris and Flag data show that the proposed algorithm is efficient.
  • Keywords
    entropy; pattern clustering; Flag data; IWEKM; Iris data; categorical data; cost function; numeric data; weight entropy k-means algorithm; Clustering algorithms; Conference management; Cost function; Entropy; Fuzzy systems; Iris; Knowledge management; Partitioning algorithms; Utility programs; clustering; k-means algorithm; partition clustering; weight entropy
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Fuzzy Systems and Knowledge Discovery, 2008. FSKD '08. Fifth International Conference on
  • Conference_Location
    Shandong
  • Print_ISBN
    978-0-7695-3305-6
  • Type
    conf
  • DOI
    10.1109/FSKD.2008.32
  • Filename
    4665935
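
The abstract builds on the cost function of entropy weighting k-means. As a rough illustration of that baseline only, the sketch below computes per-cluster feature weights and the entropy-regularised cost for a fixed assignment. It is not the IWEKM algorithm: the two extra terms the paper adds are not specified in the abstract and are left out. The weight rule (weights proportional to exp(-D/gamma)), the 0/1 matching dissimilarity for categorical features, the toy data, and every function and variable name here are assumptions made for illustration.

```python
import math


def feature_dissimilarity(a, b, numeric):
    """Squared difference for numeric features, 0/1 mismatch for categorical ones (assumed)."""
    if numeric:
        return (a - b) ** 2
    return 0.0 if a == b else 1.0


def dispersions(points, labels, centers, is_numeric):
    """D[l][j]: summed dissimilarity of feature j between cluster l's members and its center."""
    D = [[0.0] * len(is_numeric) for _ in centers]
    for x, l in zip(points, labels):
        for j, numeric in enumerate(is_numeric):
            D[l][j] += feature_dissimilarity(x[j], centers[l][j], numeric)
    return D


def entropy_weights(D, gamma):
    """Per-cluster feature weights proportional to exp(-D[l][j] / gamma), normalised per cluster."""
    weights = []
    for row in D:
        base = min(row)  # shift by the row minimum for numerical stability
        exps = [math.exp(-(d - base) / gamma) for d in row]
        total = sum(exps)
        weights.append([e / total for e in exps])
    return weights


def entropy_weighted_cost(D, W, gamma):
    """Weighted within-cluster dispersion plus gamma times the sum of w * log(w)."""
    cost = 0.0
    for d_row, w_row in zip(D, W):
        cost += sum(w * d for w, d in zip(w_row, d_row))
        cost += gamma * sum(w * math.log(w) for w in w_row if w > 0)
    return cost


# Hypothetical toy data: one numeric and one categorical feature, two clusters.
points = [(1.0, "red"), (1.2, "red"), (5.0, "blue"), (5.4, "red")]
labels = [0, 0, 1, 1]
centers = [(1.1, "red"), (5.2, "blue")]
is_numeric = [True, False]

D = dispersions(points, labels, centers, is_numeric)
W = entropy_weights(D, gamma=1.0)
cost = entropy_weighted_cost(D, W, gamma=1.0)
print(W, cost)
```

In this sketch a feature with smaller within-cluster dispersion receives a larger weight in that cluster, and gamma controls how strongly the weights concentrate on such features.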