• DocumentCode
    725408
  • Title

    BOLAS: Bipartite-Graph Oriented Locality-Aware Scheduling for MapReduce Tasks

  • Author

    Ruini Xue ; Shengli Gao ; Lixiang Ao ; Zhongyang Guan

  • Author_Institution
    Sch. of Comput. Sci. & Eng., Univ. of Electron. Sci. & Technol. of China, Chengdu, China
  • fYear
    2015
  • fDate
    June 29 2015-July 2 2015
  • Firstpage
    37
  • Lastpage
    45
  • Abstract
    Task scheduling is critical to reducing the makespan of MapReduce jobs. Improving data locality, that is, placing a task and its related data block on the same node, is an effective approach to scheduling optimization. However, recent studies have not adequately addressed the locality issue. This paper proposes BOLAS, a MapReduce task scheduling algorithm that models the scheduling process as a bipartite-graph matching problem and tries to assign each data block to the nearest task. By considering the divergence of node performance and the distribution of data blocks in a MapReduce cluster, BOLAS can achieve a high degree of data locality, guarantee minimal data transfer during execution, and consequently reduce a job's makespan. As a dynamic algorithm, BOLAS solves the model using the Kuhn-Munkres optimal matching algorithm and can be deployed in either homogeneous or heterogeneous environments. In this study, BOLAS was implemented as a plug-in for Hadoop, and the experimental results indicate that BOLAS can localize nearly 100% of the map tasks and reduce the execution time by up to 67.1%.
  • Keywords
    distributed processing; graph theory; optimisation; scheduling; BOLAS; Hadoop; Kuhn-Munkres optimal matching algorithm; MapReduce cluster; MapReduce tasks; MapReduce task scheduling algorithm; bipartite graph matching problem; bipartite graph oriented locality aware scheduling; data block; data locality; data transfer; scheduling optimization; task scheduling; Algorithm design and analysis; Bipartite graph; Bismuth; Nickel; Scheduling; Scheduling algorithms; Data Locality; Hadoop; Kuhn-Munkres (KM) optimal-matching algorithm; MapReduce; Task Scheduling;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Parallel and Distributed Computing (ISPDC), 2015 14th International Symposium on
  • Conference_Location
    Limassol
  • Print_ISBN
    978-1-4673-7147-6
  • Type

    conf

  • DOI
    10.1109/ISPDC.2015.12
  • Filename
    7165129
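
The abstract describes scheduling as a bipartite matching between map tasks and nodes, solved with the Kuhn-Munkres algorithm. The sketch below illustrates that general idea only and is not the authors' implementation: the cost model (0 when a node holds a replica of the task's block, 1 otherwise), the helper name `locality_cost`, and the sample cluster are all assumptions made for illustration. It also ignores the node-performance differences that the abstract says BOLAS takes into account.

```python
from typing import List, Sequence, Set, Tuple

INF = float("inf")


def kuhn_munkres(cost: List[List[float]]) -> Tuple[float, List[int]]:
    """Minimum-cost assignment for an n x m cost matrix with n <= m.

    Returns (total_cost, assignment), where assignment[i] is the column
    matched to row i. Uses the O(n^2 * m) potentials formulation of the
    Kuhn-Munkres (Hungarian) algorithm with 1-based internal indices.
    """
    n, m = len(cost), len(cost[0])
    assert n <= m, "need at least as many columns (slots) as rows (tasks)"
    u = [0.0] * (n + 1)      # row potentials
    v = [0.0] * (m + 1)      # column potentials
    p = [0] * (m + 1)        # p[j] = row matched to column j (0 = free)
    way = [0] * (m + 1)      # back-pointers along the augmenting path

    for i in range(1, n + 1):
        p[0] = i
        j0 = 0
        minv = [INF] * (m + 1)
        used = [False] * (m + 1)
        while True:
            used[j0] = True
            i0 = p[j0]
            delta, j1 = INF, 0
            for j in range(1, m + 1):
                if not used[j]:
                    cur = cost[i0 - 1][j - 1] - u[i0] - v[j]
                    if cur < minv[j]:
                        minv[j] = cur
                        way[j] = j0
                    if minv[j] < delta:
                        delta, j1 = minv[j], j
            # Shift potentials so that at least one new edge becomes tight.
            for j in range(m + 1):
                if used[j]:
                    u[p[j]] += delta
                    v[j] -= delta
                else:
                    minv[j] -= delta
            j0 = j1
            if p[j0] == 0:   # reached a free column: augmenting path found
                break
        # Flip the matching along the augmenting path.
        while j0:
            j1 = way[j0]
            p[j0] = p[j1]
            j0 = j1

    assignment = [-1] * n
    for j in range(1, m + 1):
        if p[j]:
            assignment[p[j] - 1] = j - 1
    total = sum(cost[i][assignment[i]] for i in range(n))
    return total, assignment


def locality_cost(block_replicas: Sequence[Set[str]],
                  slot_nodes: Sequence[str]) -> List[List[float]]:
    """Hypothetical cost model: 0 if the node stores the task's block, else 1."""
    return [[0 if node in replicas else 1 for node in slot_nodes]
            for replicas in block_replicas]


# Assumed example: three map tasks, one free slot on each of three nodes.
block_replicas = [{"n1", "n2"}, {"n1"}, {"n2", "n3"}]
slot_nodes = ["n1", "n2", "n3"]
total, assignment = kuhn_munkres(locality_cost(block_replicas, slot_nodes))
for task, slot in enumerate(assignment):
    print(f"task {task} -> {slot_nodes[slot]}")
print("non-local tasks:", total)
```

With this cost model the total equals the number of map tasks that must read their block remotely, so a total of 0 means every task runs on a node that stores its input. A scheduler that accounts for heterogeneous nodes would need a richer cost than this 0/1 placeholder.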