  • DocumentCode
    389624
  • Title
    MSGKA: an efficient clustering algorithm for large databases
  • Author
    Tsai, Cheng-Fa ; Chen, Zhi-Cheng ; Tsai, Chun-Wei
  • Author_Institution
    Dept. of Manage. Inf. Syst., Nat. Pingtung Univ. of Sci. & Technol., Taiwan
  • Volume
    5
  • fYear
    2002
  • fDate
    6-9 Oct. 2002
  • Abstract
    This investigation presents an efficient clustering algorithm for large databases. We present a novel multiple-searching genetic algorithm (MSGA) that finds a globally optimal partition of a given data set into a specified number of clusters. We hybridize MSGA with a multiple-searching approach used in clustering, namely the K-means algorithm; hence the name multiple-searching genetic K-means algorithm (MSGKA). Our simulation results reveal that the proposed clustering approach performs better than the Fast SOM combined with K-means approach (FSOM+K-means) and the Genetic K-Means Algorithm (GKA). Moreover, in all the cases we studied, our approach produces much smaller errors than both FSOM+K-means and GKA.
  • Keywords
    data mining; database theory; genetic algorithms; pattern clustering; search problems; very large databases; Fast SOM combined K-means approach; Genetic K-Means Algorithm; K-means algorithm; MSGKA; clustering algorithm; errors; large databases; multiple-searching genetic K-means algorithm; multiple-searching genetic algorithm; simulation; Biological cells; Clustering algorithms; Costs; Databases; Machine learning; Partitioning algorithms; Pattern recognition; Statistics
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Systems, Man and Cybernetics, 2002 IEEE International Conference on
  • ISSN
    1062-922X
  • Print_ISBN
    0-7803-7437-1
  • Type
    conf
  • DOI
    10.1109/ICSMC.2002.1176400
  • Filename
    1176400