• DocumentCode
    704083
  • Title
    Big-data streaming applications scheduling with online learning and concept drift detection
  • Author
    Kanoun, Karim ; Van der Schaar, Mihaela
  • Author_Institution
    Embedded Syst. Lab. (ESL), EPFL, Lausanne, Switzerland
  • fYear
    2015
  • fDate
    9-13 March 2015
  • Firstpage
    1547
  • Lastpage
    1550
  • Abstract
    Several techniques have been proposed to adapt Big-Data streaming applications to resource constraints. These techniques are mostly implemented at the application layer, make simplistic assumptions about the system resources, and are often agnostic to the system capabilities. Moreover, they often assume that the characteristics of the data streams and their processing needs are stationary, which is not true in practice. In fact, data streams are highly dynamic and may also experience concept drift, thereby requiring continuous online adaptation of the throughput and quality to each processing task. Hence, existing solutions for Big-Data streaming applications are often too conservative or too aggressive. To address these limitations, we propose an online energy-efficient scheduler that maximizes the QoS (i.e., throughput and output quality) of Big-Data streaming applications under energy and resource constraints. Our scheduler uses online adaptive reinforcement learning techniques and requires no offline information. Moreover, it is able to detect concept drifts and to smoothly adapt the scheduling strategy. Our experiments, conducted on a chain of tasks modeling a real-life streaming application, demonstrate that our scheduler is able to learn the scheduling policy and to adapt it so that it maximizes the targeted QoS under a given energy constraint as the Big-Data characteristics change dynamically.
  • Keywords
    Big Data; learning (artificial intelligence); quality of service; scheduling; QoS; application layer; big-data streaming applications scheduling; concept drift; concept drift detection; continuous online adaptation; data streams characteristics; online adaptive reinforcement learning techniques; online energy-efficient scheduler; resource constraints; scheduling policy; Data mining; Dynamic scheduling; Heuristic algorithms; Learning (artificial intelligence); Quality of service; Throughput;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Design, Automation & Test in Europe Conference & Exhibition (DATE), 2015
  • Conference_Location
    Grenoble
  • Print_ISBN
    978-3-9815-3704-8
  • Type
    conf
  • Filename
    7092635