• DocumentCode
    604081
  • Title
    Skew-Aware Task Scheduling in Clouds
  • Author
    Dongsheng Li ; Yixing Chen ; Hai, R.H.
  • Author_Institution
    Nat. Lab. for Parallel & Distrib. Process., Nat. Univ. of Defense Technol., Changsha, China
  • fYear
    2013
  • fDate
    25-28 March 2013
  • Firstpage
    341
  • Lastpage
    346
  • Abstract
    Data skew is a major cause of stragglers in MapReduce-like cloud systems. In this paper, we propose a Skew-Aware Task Scheduling (SATS) mechanism for iterative applications in MapReduce-like systems. The mechanism exploits the similarity of data distributions in adjacent iterations of an iterative application to mitigate the straggler problem caused by data skew. It collects data distribution information while the tasks of the current iteration execute, and uses that information to guide data partitioning for the tasks of the next iteration. We implement the mechanism in the HaLoop system and deploy it on a cluster. Experiments show that the proposed mechanism handles data skew and improves load balancing effectively.
  • Keywords
    cloud computing; iterative methods; resource allocation; scheduling; task analysis; HaLoop system; MapReduce-like cloud systems; MapReduce-like systems; SATS mechanism; data distribution information; data partitioning; data skew; iterative applications; load balancing; skew-aware task scheduling; Computational modeling; Data models; Data structures; Distributed databases; File systems; Load management; Processor scheduling; Cloud; Data Skew; Load balancing; Task Scheduling;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Service Oriented System Engineering (SOSE), 2013 IEEE 7th International Symposium on
  • Conference_Location
    Redwood City
  • Print_ISBN
    978-1-4673-5659-6
  • Type
    conf
  • DOI
    10.1109/SOSE.2013.64
  • Filename
    6525543
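
The idea summarized in the abstract (collect the data distribution during the current iteration, then use it to guide partitioning in the next) can be illustrated with a minimal Python sketch. This is not the SATS implementation and does not use any HaLoop API: the abstract does not say how partitions are chosen, so the greedy "heaviest key to least-loaded partition" rule below is an assumption, and the function names `collect_key_counts`, `build_partition_map`, `partition_for`, and `partition_loads` are hypothetical.

```python
import heapq
import zlib
from collections import Counter


def collect_key_counts(records):
    """Count records per key while the current iteration runs."""
    return Counter(key for key, _ in records)


def build_partition_map(key_counts, num_partitions):
    """Assign keys to partitions for the next iteration.

    Assumed heuristic (not taken from the paper): process keys from
    heaviest to lightest and place each on the least-loaded partition.
    """
    heap = [(0, p) for p in range(num_partitions)]  # (load, partition id)
    heapq.heapify(heap)
    mapping = {}
    # Ties are broken by the key's string form so the result is deterministic.
    ordered = sorted(key_counts.items(), key=lambda kv: (-kv[1], str(kv[0])))
    for key, count in ordered:
        load, p = heapq.heappop(heap)
        mapping[key] = p
        heapq.heappush(heap, (load + count, p))
    return mapping


def partition_for(key, mapping, num_partitions):
    """Use the learned mapping; fall back to a stable hash for unseen keys."""
    if key in mapping:
        return mapping[key]
    return zlib.crc32(repr(key).encode("utf-8")) % num_partitions


def partition_loads(records, mapping, num_partitions):
    """Number of records each partition would receive."""
    loads = [0] * num_partitions
    for key, _ in records:
        loads[partition_for(key, mapping, num_partitions)] += 1
    return loads
```

In use, the counts gathered in iteration i feed the partitioner for iteration i+1, which only helps if the key distribution stays similar between adjacent iterations, as the abstract assumes:

```python
iteration_1 = [("a", 1)] * 6 + [("b", 1)] * 3 + [("c", 1)] * 2 + [("d", 1)]
mapping = build_partition_map(collect_key_counts(iteration_1), num_partitions=2)
print(partition_loads(iteration_1, mapping, 2))  # [6, 6]
```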