Title :
Enhancing Accuracy and Performance of Collaborative Filtering Algorithm by Stochastic SVD and Its MapReduce Implementation
Author :
Che-Rung Lee ; Ya-Fang Chang
Author_Institution :
Dept. of Comput. Sci., Nat. Tsing Hua Univ., Hsinchu, Taiwan
Abstract :
Collaborative filtering algorithms, which extract desired information from records, are widely used in data mining and information retrieval applications such as recommendation systems. However, rapidly growing data sizes demand more efficient and scalable algorithms and implementations. In this paper, we present a novel algorithm that uses stochastic singular value decomposition (SSVD) in the computation of item-based collaborative filtering. The use of SSVD not only provides more accurate results in terms of precision and recall, but also reduces the computational cost. The proposed algorithm was implemented using Hadoop MapReduce, which allows distributed processing of massive data stored in a distributed file system. The implementation was evaluated and compared with the recommendation systems provided in the Apache Mahout project, and a speedup of 2.53x was obtained when processing millions of records. The accuracy of our algorithm is also 3 times better than that of the non-SVD algorithm in terms of the F1 metric, a combined measure of precision and recall.
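The idea summarized in the abstract can be illustrated with a minimal single-machine sketch. This is not the authors' Hadoop MapReduce implementation: it assumes a standard randomized range-finder formulation of stochastic SVD, uses NumPy in place of a distributed runtime, and the names `randomized_svd`, `item_similarity`, and `f1_score` are hypothetical, chosen only for illustration.

```python
import numpy as np


def randomized_svd(A, rank, oversample=5, n_iter=2, seed=0):
    """Approximate truncated SVD via random projection (assumed SSVD variant)."""
    rng = np.random.default_rng(seed)
    m, n = A.shape
    k = min(rank + oversample, m, n)
    # Project A onto a random k-dimensional subspace and orthonormalize.
    Q, _ = np.linalg.qr(A @ rng.standard_normal((n, k)))
    # A few power iterations sharpen the captured subspace.
    for _ in range(n_iter):
        Q, _ = np.linalg.qr(A.T @ Q)
        Q, _ = np.linalg.qr(A @ Q)
    # Exact SVD of the small projected matrix B = Q^T A.
    Ub, s, Vt = np.linalg.svd(Q.T @ A, full_matrices=False)
    U = Q @ Ub
    return U[:, :rank], s[:rank], Vt[:rank, :]


def item_similarity(ratings, rank):
    """Cosine similarity between items in the reduced rank-`rank` space.

    `ratings` is a users x items matrix; the result is items x items.
    """
    _, s, Vt = randomized_svd(ratings, rank)
    items = (Vt * s[:, None]).T          # one low-dimensional row per item
    norms = np.linalg.norm(items, axis=1, keepdims=True)
    norms[norms == 0] = 1.0              # avoid division by zero for empty items
    unit = items / norms
    return unit @ unit.T


def f1_score(recommended, relevant):
    """Harmonic mean of precision and recall for one recommendation list."""
    recommended, relevant = set(recommended), set(relevant)
    hits = len(recommended & relevant)
    if hits == 0:
        return 0.0
    precision = hits / len(recommended)
    recall = hits / len(relevant)
    return 2 * precision * recall / (precision + recall)
```

In this sketch the item-item similarities are computed on rank-reduced item vectors rather than on the full rating columns, which is one plausible reading of how an SVD step can cut the cost of item-based collaborative filtering; the paper itself should be consulted for the actual formulation.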
Keywords :
collaborative filtering; data mining; distributed databases; parallel algorithms; singular value decomposition; stochastic processes; Hadoop; MapReduce; SSVD; computational cost reduction; distributed file system; distributed massive data processing; information extraction; information retrieval; item-based collaborative filtering; stochastic SVD; Approximation algorithms; Approximation methods; Collaboration; Filtering; Filtering algorithms; Matrix decomposition; Apache Mahout; recommendation system; stochastic singular value decomposition
Conference_Title :
Parallel and Distributed Processing Symposium Workshops & PhD Forum (IPDPSW), 2013 IEEE 27th International
Conference_Location :
Cambridge, MA
Print_ISBN :
978-0-7695-4979-8
DOI :
10.1109/IPDPSW.2013.120