DocumentCode
659598
Title
MapReduce implementation of Variational Bayesian Probabilistic Matrix Factorization algorithm
Author
Tewari, Naveen C. ; Koduvely, Hari M. ; Guha, Saikat ; Yadav, Ankesh ; David, Gladbin
Author_Institution
Center for Knowledge Driven Intell. Syst., Infosys Ltd., Bangalore, India
fYear
2013
fDate
6-9 Oct. 2013
Firstpage
145
Lastpage
152
Abstract
We introduce in this paper a scalable implementation of the Variational Bayesian Matrix Factorization method for collaborative filtering using the MapReduce framework. Variational Bayesian methods have the advantage of providing good approximate analytical solutions for the posterior distribution. Due to the independence assumption about the parameters in the posterior distribution, variational methods are also likely to parallelize efficiently. Though the Variational Bayesian Matrix Factorization method has been shown to produce more accurate results in collaborative filtering, its scaling properties have not been studied so far. We ran our MapReduce implementation on the CiteULike data set and show that our parallelization scheme achieves approximately linear scaling. We also compare its performance with the MapReduce implementation of a popular matrix factorization algorithm, ALSWR, from the open source machine learning library Mahout.
Keywords
Bayes methods; collaborative filtering; distributed processing; matrix decomposition; variational techniques; ALSWR; CiteULike data set; Mahout; MapReduce framework; MapReduce implementation; collaborative filtering; open source machine learning library; parallelization scheme; posterior distribution; variational Bayesian probabilistic matrix factorization algorithm; Approximation methods; Bayes methods; Cost function; Equations; Indexes; Niobium; Sparse matrices; Collaborative Filtering; Distributed Computing; MapReduce; Probabilistic Matrix Factorization; Recommendation Systems; Variational Bayesian Matrix Factorization;
fLanguage
English
Publisher
IEEE
Conference_Titel
Big Data, 2013 IEEE International Conference on
Conference_Location
Silicon Valley, CA
Type
conf
DOI
10.1109/BigData.2013.6691747
Filename
6691747