Title :
Fully distributed robust singular value decomposition
Author :
Hegedus, Istvan ; Jelasity, Mark ; Kocsis, Levente ; Benczur, Andras A.
Author_Institution :
Univ. of Szeged, Szeged, Hungary
Abstract :
Low-rank matrix approximation is an important tool in data mining with a wide range of applications including recommender systems, clustering, and identifying topics in documents. The problem we tackle is implementing singular value decomposition (SVD)-a popular method for low rank approximation in large fully distributed P2P systems in a robust and scalable manner. We assume that the matrix to be approximated is stored in a large network where each node knows one row of the matrix (personal attributes, documents, media ratings, etc). In our P2P model, we do not allow this personal information to leave the node, yet we want the nodes to collaboratively compute the SVD. Methods applied in large scale distributed systems such as synchronized parallel gradient search or distributed iterative methods are not preferable in our system model due to their requirements of synchronized rounds or their inherent issues with load balancing. Our approach overcomes these limitations with the help of a distributed stochastic gradient search in which the personal part of the decomposition remains local, and the global part (e.g., movie features) converges at all nodes to the correct value. We present a theoretical derivation of our algorithm, as well as a thorough experimental evaluation of real and synthetic data as well. We demonstrate that the convergence speed of our method is competitive while not relying on synchronization and being robust to extreme failure scenarios.
Keywords :
data mining; peer-to-peer computing; resource allocation; search problems; singular value decomposition; SVD; clustering; data mining; distributed iterative methods; distributed stochastic gradient search; document topic identification; extreme failure scenarios; fully distributed P2P systems; fully distributed robust singular value decomposition; load balancing; low-rank matrix approximation; personal information; recommender systems; synchronized parallel gradient search; synchronized rounds; synthetic data; Approximation algorithms; Approximation methods; Convergence; Matrix decomposition; Peer-to-peer computing; Synchronization; Vectors; data mining; matrix factorization; online learning; privacy; singular value decomposition; stochastic gradient descent;
Conference_Titel :
Peer-to-Peer Computing (P2P), 14-th IEEE International Conference on
Conference_Location :
London
Print_ISBN :
978-1-4799-6200-6
DOI :
10.1109/P2P.2014.6934299