DocumentCode :
3319135
Title :
SAND Join — A skew handling join algorithm for Google's MapReduce framework
Author :
Atta, Fariha ; Viglas, Stratis D. ; Niazi, Salman
Author_Institution :
Nat. Univ. of Comput. & Emerging Sci., Pakistan
fYear :
2011
fDate :
22-24 Dec. 2011
Firstpage :
170
Lastpage :
175
Abstract :
The simplicity and flexibility of the MapReduce framework have motivated programmers of large-scale distributed data processing applications to build their applications on it. However, implementations of this framework, including Hadoop, do not handle skew in the input data effectively. Skew in the input data results in poor load balancing, which can swamp the benefits achievable by parallelizing applications on such parallel processing frameworks. The performance of the join operation, which is the most expensive and most frequently executed operation, is severely degraded in the presence of heavy skew in the input datasets to be joined. Hadoop's implementation of the join operation cannot handle such skewed joins effectively, which is attributed to its use of hash partitioning for load distribution. In this work, we introduce “Skew hANDling Join” (SAND Join), which employs range partitioning instead of hash partitioning for load distribution. Experiments show that the SAND Join algorithm can efficiently perform joins on datasets that are sufficiently skewed. We also compare the performance of this algorithm with that of Hadoop's join algorithms.
Keywords :
parallel processing; resource allocation; Google MapReduce framework; Hadoop join algorithm; SAND Join algorithm; distributed data processing application; join operation; load balancing; load distribution; parallel processing framework; parallelization; range partitioning; skew handling join; skew handling join algorithm
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Multitopic Conference (INMIC), 2011 IEEE 14th International
Conference_Location :
Karachi
Print_ISBN :
978-1-4577-0654-7
Type :
conf
DOI :
10.1109/INMIC.2011.6151466
Filename :
6151466