Title :
On Datacenter-Network-Aware Load Balancing in MapReduce
Author :
Yanfang Le ; Feng Wang ; Jiangchuan Liu ; Funda Ergun
Author_Institution :
Simon Fraser Univ., Burnaby, BC, Canada
Abstract :
MapReduce has emerged as a powerful tool for distributed and scalable processing of voluminous data. For skewed data input, load balancing among the MapReduce worker nodes is necessary to minimize the overall finishing time; this, however, can incur massive data movement in a datacenter network. In this paper, we examine, for the first time, the problem of datacenter-network-aware load balancing in the shuffle subphase of MapReduce. Unlike earlier studies, which generally assume that the network inside a datacenter has negligible delay and infinite capacity, we consider the traffic and bottlenecks in real datacenter networks by introducing constraints on the available network bandwidth, and demonstrate that the corresponding problem can be decomposed into two subproblems, one for network flow and one for load balancing. We show effective solutions to both, which together yield a complete solution towards near-optimal datacenter-network-aware load balancing. A much simpler greedy algorithm with comparable performance is also developed for fast implementation in practice. The effectiveness of our solution has been demonstrated on synthetic and real public datasets.
Keywords :
computer centres; data handling; greedy algorithms; parallel processing; resource allocation; MapReduce; datacenter network; distributed processing; greedy algorithm; network bandwidth; optimal datacenter-network-aware load balancing; scalable processing; shuffle subphase; skewed data input; Algorithm design and analysis; Bandwidth; Clustering algorithms; Linear programming; Load management; Network topology; Partitioning algorithms; Datacenter Network; Load Balancing; MapReduce;
Conference_Title :
Cloud Computing (CLOUD), 2015 IEEE 8th International Conference on
Conference_Location :
New York City, NY
Print_ISBN :
978-1-4673-7286-2
DOI :
10.1109/CLOUD.2015.71