• DocumentCode
    3261650
  • Title
    An Improved Genetic k-means Algorithm for Optimal Clustering
  • Author
    Guo, Hai-xiang; Zhu, Ke-Jun; Gao, Si-wei; Liu, Ting
  • Author_Institution
    Sch. of Manage., China Univ. of Geosciences, Wuhan
  • fYear
    2006
  • fDate
    Dec. 2006
  • Firstpage
    793
  • Lastpage
    797
  • Abstract
    In the classical k-means algorithm, the value of k must be fixed in advance, yet in practice it is difficult to determine k accurately. This paper proposes an improved genetic k-means algorithm (IGKM) and constructs a fitness function defined as the product of three factors; maximizing it ensures the formation of a small number of compact clusters with large separation between at least two clusters. Experiments on two artificial and three real-life data sets compare IGKM with the k-means algorithm, a GA-based method, and the genetic k-means algorithm (GKM) in terms of inter-cluster distance (ITD), inner-cluster distance (IND), and rate of separation exactness. The experiments show that IGKM can automatically reach the optimal value of k with high accuracy.
  • Keywords
    data mining; pattern clustering; IGKM; data sets; fitness function; improved genetic k-means algorithm; inner-cluster distance; inter-cluster distance; optimal clustering; separation exactness; Biological cells; Clustering algorithms; Conferences; Data mining; Genetics; Geology; Geoscience; Information technology; Partitioning algorithms; Physics
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Title
    Data Mining Workshops, 2006. ICDM Workshops 2006. Sixth IEEE International Conference on
  • Conference_Location
    Hong Kong
  • Print_ISBN
    0-7695-2702-7
  • Type
    conf
  • DOI
    10.1109/ICDMW.2006.30
  • Filename
    4063733