• DocumentCode
    2770888
  • Title
    Cross-Guided Clustering: Transfer of Relevant Supervision across Domains for Improved Clustering
  • Author
    Bhattacharya, Indrajit ; Godbole, Shantanu ; Joshi, Sachindra ; Verma, Ashish
  • Author_Institution
    IBM Research, India
  • fYear
    2009
  • fDate
    6-9 Dec. 2009
  • Firstpage
    41
  • Lastpage
    50
  • Abstract
    Lack of supervision in clustering algorithms often leads to clusters that are not useful or interesting to human reviewers. We investigate if supervision can be automatically transferred to a clustering task in a target domain, by providing a relevant supervised partitioning of a dataset from a different source domain. The target clustering is made more meaningful for the human user by trading off intrinsic clustering goodness on the target dataset for alignment with relevant supervised partitions in the source dataset, wherever possible. We propose a cross-guided clustering algorithm that builds on traditional k-means by aligning the target clusters with source partitions. The alignment process makes use of a cross-domain similarity measure that discovers hidden relationships across domains with potentially different vocabularies. Using multiple real-world datasets, we show that our approach improves clustering accuracy significantly over traditional k-means.
  • Keywords
    pattern clustering; cross-domain similarity measure; cross-guided clustering; improved clustering; intrinsic clustering; supervised partitioning; Automobiles; Clustering algorithms; Costs; Data mining; Humans; Partitioning algorithms; Personnel; Training data; Unsupervised learning; Vocabulary; Clustering methods; Relationship Discovery; Transfer Learning
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Data Mining, 2009. ICDM '09. Ninth IEEE International Conference on
  • Conference_Location
    Miami, FL
  • ISSN
    1550-4786
  • Print_ISBN
    978-1-4244-5242-2
  • Electronic_ISBN
    1550-4786
  • Type
    conf
  • DOI
    10.1109/ICDM.2009.33
  • Filename
    5360229