• DocumentCode
    3199392
  • Title

    SMapReduce: Optimising Resource Allocation by Managing Working Slots at Runtime

  • Author

    Liang, Feng; Lau, Francis C. M.

  • Author_Institution
    Dept. of Comput. Sci., Univ. of Hong Kong, Hong Kong, China
  • fYear
    2015
  • fDate
    25-29 May 2015
  • Firstpage
    281
  • Lastpage
    290
  • Abstract
    Hadoop version 1 (HadoopV1) and version 2 (YARN) manage the resources in a distributed system in different ways. HadoopV1 executes MapReduce tasks in statically configured working slots, whereas YARN uses a set of task containers to encapsulate its memory and CPU resources. However, neither of them considers the runtime performance of the cluster when deciding the proper number of concurrent tasks to run on each node to achieve the optimal throughput. To gain higher performance, Hadoop users usually need to rely on their experience to carefully configure the resources of the cluster and the resources needed by their jobs. But because the workload in the cluster is constantly changing, such a manual configuration rarely leads to optimized performance. In this paper, we study the MapReduce job performance in HadoopV1 and YARN with different resource configurations, and model the cluster throughput in terms of the resource capacity of the cluster. We propose SMapReduce, which dynamically manages the number of concurrent tasks running on each node. SMapReduce can attain the maximum job throughput by considering the thrashing phenomenon and the balance between map and reduce tasks. Evaluation results show that SMapReduce can yield significant performance speedup compared to both HadoopV1 and YARN for various MapReduce workloads.
  • Keywords
    data handling; parallel processing; resource allocation; Hadoop version 1; Hadoop version 2; HadoopV1; SMapReduce; YARN; cluster resource capacity; resource allocation; thrashing phenomenon; Containers; Heart beat; Resource management; Runtime; Synchronization; Throughput; Yarn; Hadoop; MapReduce; Performance Modelling; Resource Management;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Parallel and Distributed Processing Symposium (IPDPS), 2015 IEEE International
  • Conference_Location
    Hyderabad
  • ISSN
    1530-2075
  • Type

    conf

  • DOI
    10.1109/IPDPS.2015.17
  • Filename
    7161517