• DocumentCode
    3717252
  • Title
    ADMM based scalable machine learning on Spark
  • Author
    Sauptik Dhar;Congrui Yi;Naveen Ramakrishnan;Mohak Shah
  • Author_Institution
    Research and Technology Center, Robert Bosch LLC, Palo Alto, CA 94304, USA
  • fYear
    2015
  • Firstpage
    1174
  • Lastpage
    1182
  • Abstract
    Most machine learning algorithms involve solving a convex optimization problem. Traditional in-memory convex optimization solvers do not scale well as the volume of data increases. This paper identifies a generic convex problem for most machine learning algorithms and solves it using the Alternating Direction Method of Multipliers (ADMM). Finally, such an ADMM problem transforms into an iterative system of linear equations, which can be easily solved at scale in a distributed fashion. We implement this framework in Apache Spark and compare it with the widely used Machine Learning LIBrary (MLLIB) in Apache Spark 1.3.
  • Keywords
    "Machine learning algorithms","Optimization","Sparks","Loss measurement","Distributed databases","Convex functions","Big data"
  • Publisher
    IEEE
  • Conference_Titel
    Big Data (Big Data), 2015 IEEE International Conference on
  • Type
    conf
  • DOI
    10.1109/BigData.2015.7363871
  • Filename
    7363871