• DocumentCode
    2453767
  • Title

    Pairwise Constrained Clustering with Group Similarity-Based Patterns

  • Author

    Hu, Tianming ; Liu, Chuanren ; Sun, Jing ; Sung, Sam Yuan ; Ng, Peter A.

  • fYear
    2010
  • fDate
    12-14 Dec. 2010
  • Firstpage
    260
  • Lastpage
    265
  • Abstract
    Conventional k-means considers only pairwise similarity during cluster assignment, aiming to minimize the distance from each point to its nearest cluster centroid. In high-dimensional spaces such as document datasets, however, two points may be nearest neighbors without belonging to the same class, so pairwise similarity alone is often insufficient for class prediction. To that end, we propose augmenting k-means with pairwise constraints generated from group similarity-based hyperclique patterns, which consist of strongly affiliated objects and serve as more reliable seeds for classification. Experiments with real-world datasets show that constraints from quality hyperclique patterns improve the clustering results in terms of various external criteria. Our experiments also indicate that even if few constraints are violated in the original k-means result, imposing many quality constraints may still bring a performance gain.
  • Keywords
    pattern classification; pattern clustering; class prediction; classification; cluster assignment; document dataset; group similarity-based hyperclique pattern; k-means; nearest cluster centroid; nearest neighbor; pairwise constrained clustering; pairwise similarity; quality constraint; Artificial neural networks; Clustering algorithms; Data mining; Entropy; Itemsets; Nearest neighbor searches; Wireless application protocol; constrained clustering; hyperclique patterns; pairwise constraints
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Machine Learning and Applications (ICMLA), 2010 Ninth International Conference on
  • Conference_Location
    Washington, DC
  • Print_ISBN
    978-1-4244-9211-4
  • Type

    conf

  • DOI
    10.1109/ICMLA.2010.45
  • Filename
    5708842