• DocumentCode
    2909746
  • Title
    Probabilistic and Graphical Model based Genetic Algorithm Driven Clustering with Instance-level Constraints
  • Author
    Hong, Yi ; Kwong, Sam ; Wang, Hanli ; Ren, Qingsheng ; Chang, Yuchou
  • Author_Institution
    Dept. of Comput. Sci., City Univ. of Hong Kong, Kowloon
  • fYear
    2008
  • fDate
    1-6 June 2008
  • Firstpage
    322
  • Lastpage
    329
  • Abstract
    Clustering is traditionally viewed as an unsupervised method for data analysis. However, several recent studies have shown that limited prior instance-level knowledge can significantly improve the performance of clustering algorithms. This paper proposes a semi-supervised clustering algorithm, termed probabilistic and graphical model based genetic algorithm driven clustering with instance-level constraints (Cop-CGA). In Cop-CGA, all prior knowledge about pairs of instances that should or should not be placed in the same group is represented as a graph, and all candidate clustering solutions are sampled from this graph using different orders of assigning instances to a certain number of groups. We illustrate how to design Cop-CGA to guarantee that all candidate solutions satisfy the given constraints, and we demonstrate the usefulness of background knowledge for genetic algorithm driven clustering through experiments on several real data sets with artificial hard constraints. One advantage of Cop-CGA is that both positive and negative instance-level constraints can be easily incorporated. Moreover, the performance of Cop-CGA is not sensitive to the order in which instances are assigned to groups.
  • Keywords
    genetic algorithms; graph theory; pattern clustering; probability; clustering algorithm; data analysis; genetic algorithm driven clustering; instance-level constraints; instance-level knowledge; semisupervised clustering algorithm; unsupervised method; Algorithm design and analysis; Clustering algorithms; Computer science; Data analysis; Genetic algorithms; Graphical models; Image retrieval; Iterative algorithms; Partitioning algorithms; Pattern analysis;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Evolutionary Computation, 2008. CEC 2008. (IEEE World Congress on Computational Intelligence). IEEE Congress on
  • Conference_Location
    Hong Kong
  • Print_ISBN
    978-1-4244-1822-0
  • Electronic_ISBN
    978-1-4244-1823-7
  • Type
    conf
  • DOI
    10.1109/CEC.2008.4630817
  • Filename
    4630817