• DocumentCode
    1634608
  • Title

    A genetic rule-based data clustering toolkit

  • Author

    Sarafis, I. ; Zalzala, A.M.S. ; Trinder, P.W.

  • Author_Institution
    Dept. of Comput. & Electr. Eng., Heriot-Watt Univ., Edinburgh, UK
  • Volume
    2
  • fYear
    2002
  • fDate
    2002
  • Firstpage
    1238
  • Lastpage
    1243
  • Abstract
    Clustering is a hard combinatorial problem and is defined as the unsupervised classification of patterns. The formation of clusters is based on the principle of maximizing the similarity between objects of the same cluster while simultaneously minimizing the similarity between objects belonging to distinct clusters. This paper presents a tool for database clustering using a rule-based genetic algorithm (RBCGA). RBCGA evolves individuals consisting of a fixed set of clustering rules, where each rule includes d non-binary intervals, one for each feature. The investigations attempt to alleviate certain drawbacks of the classical minimization of the square-error criterion by suggesting a flexible fitness function that takes into consideration cluster asymmetry, density, coverage and homogeny.
  • Keywords
    data mining; database theory; genetic algorithms; knowledge based systems; least mean squares methods; pattern clustering; very large databases; cluster asymmetry; combinatorial problem; database clustering; flexible fitness function; genetic rule-based data clustering toolkit; huge databases; minimization; nonbinary intervals; object similarity; rule-based genetic algorithm; square-error criterion; unsupervised pattern classification; Clustering algorithms; Computational efficiency; Data analysis; Delta modulation; Gravity; Multidimensional systems; Partitioning algorithms; Spatial databases
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Proceedings of the 2002 Congress on Evolutionary Computation (CEC '02)
  • Conference_Location
    Honolulu, HI
  • Print_ISBN
    0-7803-7282-4
  • Type

    conf

  • DOI
    10.1109/CEC.2002.1004420
  • Filename
    1004420