• Title of article

    Towards effective document clustering: A constrained K-means based approach

  • Author/Authors

    Guobiao Hu, Shuigeng Zhou, Jihong Guan, Xiaohua Hu & Nick Cercone

  • Issue Information
    Bimonthly with serial issue number, year 2008
  • Pages
    13
  • From page
    1397
  • To page
    1409
  • Abstract
    Document clustering is an important tool for organizing and browsing document collections. In real applications, some limited knowledge about the cluster membership of a small number of documents is often available, such as pairs of documents known to belong to the same cluster. This kind of prior knowledge can serve as constraints for the clustering process. We integrate the constraints into the trace formulation of the sum of squared Euclidean distances function of K-means. Then, the combined criterion function is transformed into a trace maximization problem, which is further optimized by eigen-decomposition. Our experimental evaluation shows that the proposed semi-supervised clustering method can achieve better performance than three existing methods.
  • Keywords
    Document clustering, Semi-supervised learning, Spectral relaxation, Clustering with prior knowledge
  • Journal title
    Information Processing and Management
  • Serial Year
    2008
  • Record number

    1228834