DocumentCode :
2984595
Title :
Distributed Matrix Completion
Author :
Teflioudi, C. ; Makari, F. ; Gemulla, R.
Author_Institution :
Max-Planck-Inst. für Inf., Saarbrücken, Germany
fYear :
2012
fDate :
10-13 Dec. 2012
Firstpage :
655
Lastpage :
664
Abstract :
We discuss parallel and distributed algorithms for large-scale matrix completion on problems with millions of rows, millions of columns, and billions of revealed entries. We focus on in-memory algorithms that run on a small cluster of commodity nodes; even very large problems can be handled effectively in such a setup. Our DALS, ASGD, and DSGD++ algorithms are novel variants of the popular alternating least squares and stochastic gradient descent algorithms; they exploit thread-level parallelism, in-memory processing, and asynchronous communication. We provide some guidance on the asymptotic performance of each algorithm and investigate the performance of both our algorithms and previously proposed MapReduce algorithms in large-scale experiments. We found that DSGD++ outperforms competing methods in terms of overall runtime, memory consumption, and scalability. Using DSGD++, we can factor a matrix with 10B entries on 16 compute nodes in around 40 minutes.
Keywords :
data mining; gradient methods; least squares approximations; parallel algorithms; ASGD algorithm; DALS algorithm; DSGD++ algorithm; Map Reduce algorithm; alternating least squares algorithm; asymptotic performance; asynchronous communication; commodity node; data mining; distributed algorithm; distributed matrix completion; in-memory algorithm; in-memory processing; large-scale matrix completion; memory consumption; parallel algorithm; stochastic gradient descent algorithm; thread-level parallelism; Algorithm design and analysis; Clustering algorithms; Convergence; Distributed algorithms; Instruction sets; Schedules; Training; ALS; parallel and distributed matrix factorization; recommender systems; stochastic gradient descent;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Data Mining (ICDM), 2012 IEEE 12th International Conference on
Conference_Location :
Brussels
ISSN :
1550-4786
Print_ISBN :
978-1-4673-4649-8
Type :
conf
DOI :
10.1109/ICDM.2012.120
Filename :
6413862