Title :
ADMM based scalable machine learning on Spark
Author :
Sauptik Dhar;Congrui Yi;Naveen Ramakrishnan;Mohak Shah
Author_Institution :
Research and Technology Center, Robert Bosch LLC, Palo Alto, CA 94304, USA
Abstract :
Most machine learning algorithms involve solving a convex optimization problem. Traditional in-memory convex optimization solvers do not scale well as the data size grows. This paper identifies a generic convex problem that covers most machine learning algorithms and solves it using the Alternating Direction Method of Multipliers (ADMM). The resulting ADMM problem reduces to an iterative system of linear equations, which can be solved at scale in a distributed fashion. We implement this framework in Apache Spark and compare it with the widely used Machine Learning Library (MLlib) in Apache Spark 1.3.
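The reduction from ADMM to repeated linear solves can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the lasso problem, min 0.5*||Ax - b||^2 + lam*||x||_1, as one instance of a convex learning problem, and it uses plain Python lists as stand-ins for Spark partitions. All function names (`gram`, `add_stats`, `solve`, `admm_lasso`) are hypothetical. Each partition contributes only small d-by-d statistics, which are summed once; every ADMM iteration then solves one small linear system.

```python
# Illustrative ADMM sketch for lasso; an assumption-laden example, not the paper's code.

def gram(rows):
    """Per-partition statistics (A_k^T A_k, A_k^T b_k) from (features, label) rows."""
    d = len(rows[0][0])
    ata = [[0.0] * d for _ in range(d)]
    atb = [0.0] * d
    for a, b in rows:
        for i in range(d):
            atb[i] += a[i] * b
            for j in range(d):
                ata[i][j] += a[i] * a[j]
    return ata, atb

def add_stats(s, t):
    """Combine the statistics of two partitions (the 'reduce' step)."""
    (m1, v1), (m2, v2) = s, t
    d = len(v1)
    return ([[m1[i][j] + m2[i][j] for j in range(d)] for i in range(d)],
            [v1[i] + v2[i] for i in range(d)])

def solve(m, v):
    """Solve m x = v by Gaussian elimination with partial pivoting."""
    n = len(v)
    aug = [row[:] + [v[i]] for i, row in enumerate(m)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(aug[r][c]))
        aug[c], aug[p] = aug[p], aug[c]
        for r in range(c + 1, n):
            f = aug[r][c] / aug[c][c]
            for k in range(c, n + 1):
                aug[r][k] -= f * aug[c][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(aug[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (aug[i][n] - s) / aug[i][i]
    return x

def soft_threshold(t, kappa):
    """Proximal operator of kappa * |t|."""
    return max(t - kappa, 0.0) - max(-t - kappa, 0.0)

def admm_lasso(partitions, lam=0.1, rho=1.0, iters=200):
    # One pass over the data: aggregate the small d-by-d statistics.
    stats = None
    for part in partitions:
        s = gram(part)
        stats = s if stats is None else add_stats(stats, s)
    ata, atb = stats
    d = len(atb)
    # The left-hand side (A^T A + rho * I) stays fixed across iterations.
    lhs = [[ata[i][j] + (rho if i == j else 0.0) for j in range(d)]
           for i in range(d)]
    x, z, u = [0.0] * d, [0.0] * d, [0.0] * d
    for _ in range(iters):
        # x-update: one linear system per iteration.
        rhs = [atb[i] + rho * (z[i] - u[i]) for i in range(d)]
        x = solve(lhs, rhs)
        # z-update: elementwise soft-thresholding.
        z = [soft_threshold(x[i] + u[i], lam / rho) for i in range(d)]
        # u-update: scaled dual variable.
        u = [u[i] + x[i] - z[i] for i in range(d)]
    return z
```

In this sketch the data are touched only once, when the statistics are aggregated; the iterations operate on d-by-d quantities, which is why the per-iteration cost does not grow with the number of rows.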
Keywords :
"Machine learning algorithms","Optimization","Sparks","Loss measurement","Distributed databases","Convex functions","Big data"
Conference_Title :
Big Data (Big Data), 2015 IEEE International Conference on
DOI :
10.1109/BigData.2015.7363871