DocumentCode :
38399
Title :
DynamicMR: A Dynamic Slot Allocation Optimization Framework for MapReduce Clusters
Author :
Shanjiang Tang ; Bu-Sung Lee ; Bingsheng He
Author_Institution :
Sch. of Comput. Eng., Nanyang Technol. Univ., Singapore, Singapore
Volume :
2
Issue :
3
fYear :
2014
fDate :
July-Sept. 1 2014
Firstpage :
333
Lastpage :
347
Abstract :
MapReduce is a popular computing paradigm for large-scale data processing in cloud computing. However, the slot-based MapReduce system (e.g., Hadoop MRv1) can suffer from poor performance due to its unoptimized resource allocation. To address this, the paper identifies and optimizes resource allocation from three key aspects. First, because map slots and reduce slots are pre-configured as distinct, non-fungible resources, slots can be severely under-utilized: map slots might be fully utilized while reduce slots are empty, and vice versa. We propose an alternative technique called Dynamic Hadoop Slot Allocation that keeps the slot-based model. It relaxes the slot allocation constraint to allow slots to be reallocated to either map or reduce tasks depending on their needs. Second, speculative execution can tackle the straggler problem, and it has been shown to improve the performance of a single job, but at the expense of cluster efficiency. In view of this, we propose Speculative Execution Performance Balancing to balance the performance tradeoff between a single job and a batch of jobs. Third, delay scheduling has been shown to improve data locality, but at the cost of fairness. As an alternative, we propose a technique called Slot PreScheduling that can improve data locality with no impact on fairness. Finally, by combining these techniques, we form a step-by-step slot allocation system called DynamicMR that can improve the performance of MapReduce workloads substantially. The experimental results show that DynamicMR can improve the performance of Hadoop MRv1 significantly while maintaining fairness, by up to 46~115 percent for single jobs and 49~112 percent for multiple jobs. Moreover, an experimental comparison with YARN shows that DynamicMR outperforms YARN by about 2~9 percent for multiple jobs, owing to its ratio control mechanism for running map/reduce tasks.
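The slot-borrowing idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm: the function names, parameters, and the simple "lend idle slots to the other task type" rule are assumptions made for illustration only, and the sketch ignores fairness across pools, speculative execution, and data locality.

```python
# Illustrative sketch only: all names below are hypothetical, not from the paper.

def static_allocation(map_slots, reduce_slots, pending_maps, pending_reduces):
    """Fixed slot model: map slots run only map tasks, reduce slots only reduce tasks."""
    running_maps = min(map_slots, pending_maps)
    running_reduces = min(reduce_slots, pending_reduces)
    return running_maps, running_reduces


def dynamic_allocation(map_slots, reduce_slots, pending_maps, pending_reduces):
    """Relaxed slot model (assumed rule): idle slots of one type may be
    borrowed by waiting tasks of the other type."""
    running_maps, running_reduces = static_allocation(
        map_slots, reduce_slots, pending_maps, pending_reduces
    )
    idle_map_slots = map_slots - running_maps
    idle_reduce_slots = reduce_slots - running_reduces

    # Map tasks still waiting may borrow idle reduce slots, and vice versa.
    extra_maps = min(idle_reduce_slots, pending_maps - running_maps)
    extra_reduces = min(idle_map_slots, pending_reduces - running_reduces)
    return running_maps + extra_maps, running_reduces + extra_reduces


if __name__ == "__main__":
    # 4 map slots, 4 reduce slots, 10 waiting map tasks, no reduce tasks yet.
    print(static_allocation(4, 4, 10, 0))
    print(dynamic_allocation(4, 4, 10, 0))
```

In this toy setting the static model leaves every reduce slot idle while map tasks queue, whereas the relaxed model lets the waiting map tasks use them, which is the under-utilization problem the abstract describes.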
Keywords :
cloud computing; optimisation; resource allocation; DynamicMR; Hadoop MRv1; MapReduce clusters; cloud computing; computing paradigm; data locality; dynamic Hadoop slot allocation; dynamic slot allocation optimization framework; large-scale data processing; map slots; ratio control mechanism; slot allocation constraint; slot allocation system; slot-based MapReduce system; slot-based model; speculative execution performance balancing; unoptimized resource allocation; Cloud computing; Delays; Dynamic scheduling; Optimization; Resource management; Runtime; Yarn; Hadoop fair scheduler; MapReduce; delay scheduler; dynamicMR; slot allocation; slot preScheduling;
fLanguage :
English
Journal_Title :
Cloud Computing, IEEE Transactions on
Publisher :
IEEE
ISSN :
2168-7161
Type :
jour
DOI :
10.1109/TCC.2014.2329299
Filename :
6826491