• DocumentCode
    1800051
  • Title

    A scalable machine learning online service for big data real-time analysis

  • Author

    Baldominos, Alejandro ; Albacete, Esperanza ; Saez, Yago ; Isasi, Pedro

  • Author_Institution
    Comput. Sci. Dept., Univ. Carlos III de Madrid, Leganes, Spain
  • fYear
    2014
  • fDate
    9-12 Dec. 2014
  • Firstpage
    1
  • Lastpage
    8
  • Abstract
    This work proposes and tests a scalable machine learning architecture that provides real-time predictions or analytics as a service over domain-independent big data, built on top of the Hadoop ecosystem and exposed through a RESTful API. Systems implementing this architecture could provide companies with on-demand tools for storing, analyzing, understanding and reacting to their data, in either batch or stream fashion, and could become a valuable asset for improving business performance and a key market differentiator in this fast-paced environment. To validate the proposed architecture, two systems are developed, each providing classical machine-learning services in a different domain: the first is a recommender system for web advertising, while the second is a prediction system that learns from gamers' behavior and tries to predict future events such as purchases or churning. An evaluation of these systems shows that both services respond quickly even when a number of concurrent requests are made and, in the particular case of the second system, that the computed predictions significantly outperform random guessing.
  • Keywords
    Big Data; Internet; advertising data processing; business data processing; learning (artificial intelligence); purchasing; recommender systems; Hadoop ecosystem; RESTful API; Web advertising; big data real-time analysis; business performance; churning; classical machine-learning services; domain-independent big data; fast pace environment; gamers behavior; key market differentiator; on-demand tools; prediction system; purchases; real-time analytics; real-time predictions; recommender system; scalable machine learning architecture; scalable machine learning online service; valuable asset; Batch production systems; Big data; Computer architecture; Data models; Distributed databases; Google; Real-time systems;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Computational Intelligence in Big Data (CIBD), 2014 IEEE Symposium on
  • Conference_Location
    Orlando, FL
  • Type

    conf

  • DOI
    10.1109/CIBD.2014.7011537
  • Filename
    7011537