DocumentCode
3438369
Title
Decentralized K-Means Using Randomized Gossip Protocols for Clustering Large Datasets
Author
Fellus, Jerome ; Picard, David ; Gosselin, Philippe-Henri
Author_Institution
ETIS, ENSEA/Univ. de Cergy-Pontoise, Cergy, France
fYear
2013
fDate
7-10 Dec. 2013
Firstpage
599
Lastpage
606
Abstract
In this paper, we consider the clustering of very large datasets distributed over a network of computational units using a decentralized K-means algorithm. To obtain the same codebook at each node of the network, we use a randomized gossip aggregation protocol where only small messages are exchanged. We theoretically show the equivalence of the algorithm with a centralized K-means, provided a bound on the number of messages each node has to send is met. We provide experiments showing that the consensus is reached for a number of messages consistent with the bound, but also for a smaller number of messages, albeit with a less smooth evolution of the objective function.
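As an illustration of the scheme the abstract describes, the following is a minimal sketch, not the authors' implementation. It assumes one-dimensional points, a fully connected network, and plain pairwise averaging as the gossip step, which may differ from the protocol used in the paper. The message bound mentioned in the abstract is not reproduced. All function and parameter names are hypothetical. Each node computes per-cluster sums and counts on its own data, the nodes gossip those small statistics, and each node then updates its codebook from its local estimates.

```python
import random

def assign(points, centroids):
    """Per-cluster [sum, count] statistics for one node's local points."""
    stats = [[0.0, 0.0] for _ in centroids]
    for x in points:
        k = min(range(len(centroids)), key=lambda j: (x - centroids[j]) ** 2)
        stats[k][0] += x
        stats[k][1] += 1.0
    return stats

def update(stats, old_centroids):
    """New centroids from [sum, count]; keep the old centroid if a cluster is empty."""
    return [s / c if c > 0 else old_centroids[k] for k, (s, c) in enumerate(stats)]

def gossip_average(node_stats, n_exchanges, rng):
    """Assumed gossip step: two random nodes replace their stats with the pair's mean."""
    n = len(node_stats)
    for _ in range(n_exchanges):
        i, j = rng.sample(range(n), 2)
        for k in range(len(node_stats[i])):
            for c in (0, 1):
                mean = 0.5 * (node_stats[i][k][c] + node_stats[j][k][c])
                node_stats[i][k][c] = mean
                node_stats[j][k][c] = mean

def decentralized_kmeans(node_points, init, n_iters=10, n_exchanges=200, seed=0):
    """Return one codebook per node; no node ever sees another node's raw points."""
    rng = random.Random(seed)
    codebooks = [list(init) for _ in node_points]
    for _ in range(n_iters):
        stats = [assign(pts, cb) for pts, cb in zip(node_points, codebooks)]
        gossip_average(stats, n_exchanges, rng)
        codebooks = [update(st, cb) for st, cb in zip(stats, codebooks)]
    return codebooks

def centralized_kmeans(points, init, n_iters=10):
    """Reference K-means run on the pooled data."""
    codebook = list(init)
    for _ in range(n_iters):
        codebook = update(assign(points, codebook), codebook)
    return codebook

if __name__ == "__main__":
    nodes = [[0.0, 1.0, 9.0], [0.5, 10.0], [11.0, 10.5, 1.5], [9.5]]
    pooled = [x for pts in nodes for x in pts]
    print("centralized:", centralized_kmeans(pooled, [0.0, 5.0]))
    for idx, cb in enumerate(decentralized_kmeans(nodes, [0.0, 5.0])):
        print("node", idx, cb)
```

Pairwise averaging preserves the network-wide totals, so the ratio of averaged sum to averaged count at each node approaches the global cluster mean as exchanges accumulate. With too few exchanges the per-node codebooks can differ from one another, which is the regime the abstract describes as a less smooth evolution of the objective function.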
Keywords
distributed processing; optimisation; pattern clustering; randomised algorithms; centralized k-means algorithm; codebook; computational units; decentralized k-means algorithm; message exchange; network node; objective function; randomized gossip aggregation protocol; very-large dataset clustering; Clustering algorithms; Convergence; Data models; Optimization; Partitioning algorithms; Protocols; Vectors; Distributed clustering; randomized gossip protocols;
fLanguage
English
Publisher
IEEE
Conference_Titel
Data Mining Workshops (ICDMW), 2013 IEEE 13th International Conference on
Conference_Location
Dallas, TX
Print_ISBN
978-1-4799-3143-9
Type
conf
DOI
10.1109/ICDMW.2013.58
Filename
6753975