• DocumentCode
    969622
  • Title

    On weighting clustering

  • Author

    Nock, R. ; Nielsen, F.

  • Author_Institution
    Departement Sci. Inter-facultaire, Univ. Antilles-Guyane, Martinique
  • Volume
    28
  • Issue
    8
  • fYear
    2006
  • Firstpage
    1223
  • Lastpage
    1235
  • Abstract
    Recent papers and patents in iterative unsupervised learning have emphasized a new trend in clustering. It basically consists of penalizing solutions via weights on the instance points, somehow making clustering move toward the hardest points to cluster. The motivations come principally from an analogy with powerful supervised classification methods known as boosting algorithms. However, interest in this analogy has so far been mainly borne out from experimental studies only. This paper is, to the best of our knowledge, the first attempt at its formalization. More precisely, we handle clustering as a constrained minimization of a Bregman divergence. Weight modifications rely on the local variations of the expected complete log-likelihoods. Theoretical results show benefits resembling those of boosting algorithms and bring modified (weighted) versions of clustering algorithms such as k-means, fuzzy c-means, expectation maximization (EM), and k-harmonic means. Experiments are provided for all these algorithms, with a readily available code. They display the advantages that subtle data reweighting may bring to clustering.
  • Keywords
    expectation-maximisation algorithm; fuzzy set theory; minimisation; pattern classification; pattern clustering; unsupervised learning; Bregman divergence; boosting algorithms; complete log-likelihoods; constrained minimization; data reweighting; expectation maximization clustering algorithm; fuzzy c-means clustering algorithm; instance points; iterative unsupervised learning; k-harmonic means clustering algorithm; k-means clustering algorithm; supervised classification methods; weight modifications; weighting clustering; Algorithm design and analysis; Boosting; Clustering algorithms; Design methodology; Displays; Iterative algorithms; Minimization methods; Tail; Taylor series; Unsupervised learning; Bregman divergences; Clustering; expectation maximization; fuzzy k-means; harmonic means clustering; k-means; Algorithms; Artificial Intelligence; Cluster Analysis; Computer Simulation; Information Storage and Retrieval; Models, Statistical; Pattern Recognition, Automated
  • fLanguage
    English
  • Journal_Title
    Pattern Analysis and Machine Intelligence, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    0162-8828
  • Type

    jour

  • DOI
    10.1109/TPAMI.2006.168
  • Filename
    1642658