• DocumentCode
    3438369
  • Title
    Decentralized K-Means Using Randomized Gossip Protocols for Clustering Large Datasets
  • Author
    Fellus, Jerome ; Picard, David ; Gosselin, Philippe-Henri
  • Author_Institution
    ETIS, ENSEA/Univ. de Cergy-Pontoise, Cergy, France
  • fYear
    2013
  • fDate
    7-10 Dec. 2013
  • Firstpage
    599
  • Lastpage
    606
  • Abstract
    In this paper, we consider the clustering of very large datasets distributed over a network of computational units using a decentralized K-means algorithm. To obtain the same codebook at each node of the network, we use a randomized gossip aggregation protocol where only small messages are exchanged. We theoretically show the equivalence of the algorithm with a centralized K-means, provided a bound on the number of messages each node has to send is met. We provide experiments showing that the consensus is reached for a number of messages consistent with the bound, but also for a smaller number of messages, albeit with a less smooth evolution of the objective function.
  • Keywords
    distributed processing; optimisation; pattern clustering; randomised algorithms; centralized k-means algorithm; codebook; computational units; decentralized k-means algorithm; message exchange; network node; objective function; randomized gossip aggregation protocol; very-large dataset clustering; Clustering algorithms; Convergence; Data models; Optimization; Partitioning algorithms; Protocols; Vectors; Distributed clustering; randomized gossip protocols
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Data Mining Workshops (ICDMW), 2013 IEEE 13th International Conference on
  • Conference_Location
    Dallas, TX
  • Print_ISBN
    978-1-4799-3143-9
  • Type
    conf
  • DOI
    10.1109/ICDMW.2013.58
  • Filename
    6753975