• DocumentCode
    168657
  • Title

    PLAStiCC: Predictive Look-Ahead Scheduling for Continuous Dataflows on Clouds

  • Author

    Kumbhare, Alok Gautam ; Simmhan, Yogesh ; Prasanna, Viktor K.

  • Author_Institution
    Univ. of Southern California, Los Angeles, CA, USA
  • fYear
    2014
  • fDate
    26-29 May 2014
  • Firstpage
    344
  • Lastpage
    353
  • Abstract
    Scalable stream processing and continuous dataflow systems are gaining traction with the rise of big data, driven by the need to process high-velocity data in near real time. Unlike in batch processing systems such as MapReduce and workflows, static scheduling strategies fall short for continuous dataflows because of variations in the input data rates and the need for sustained throughput. The elastic resource provisioning of cloud infrastructure is valuable for meeting the changing resource needs of such continuous applications. However, multi-tenant cloud resources introduce yet another dimension of performance variability that impacts the application's throughput. In this paper we propose PLAStiCC, an adaptive scheduling algorithm that balances resource cost and application throughput using a prediction-based look-ahead approach. It addresses variations not only in the input data rates but also in the performance of the underlying cloud infrastructure. In addition, we propose several simpler static scheduling heuristics that operate in the absence of an accurate performance prediction model. These static and adaptive heuristics are evaluated through extensive simulations using performance traces obtained from the Amazon AWS IaaS public cloud. Our results show an improvement of up to 20% in the overall profit compared to the reactive adaptation algorithm.
  • Keywords
    cloud computing; data flow analysis; profitability; scheduling; software performance evaluation; Amazon AWS IaaS public cloud; MapReduce; PLAStiCC; adaptive scheduling algorithm; application throughput; batch processing systems; cloud infrastructure; continuous dataflow systems; elastic resource provisioning; high velocity data; multitenant cloud resources; performance variability; predictive look-ahead scheduling; profit; reactive adaptation algorithm; resource cost balances; scalable stream processing; static scheduling strategies; Cloud computing; Dynamic scheduling; Optimization; Predictive models; Quality of service; Runtime; Throughput; Continuous Dataflows; Elastic resource management; IaaS Clouds; Predictive scheduling; Stream processing;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Cluster, Cloud and Grid Computing (CCGrid), 2014 14th IEEE/ACM International Symposium on
  • Conference_Location
    Chicago, IL
  • Type

    conf

  • DOI
    10.1109/CCGrid.2014.60
  • Filename
    6846470