DocumentCode :
3172223
Title :
Grey-Box Approach for Performance Prediction in Map-Reduce Based Platforms
Author :
Kadirvel, Selvi ; Fortes, José A B
Author_Institution :
Adv. Comput. & Inf. Syst. Lab., Univ. of Florida, Gainesville, FL, USA
fYear :
2012
fDate :
July 30 2012-Aug. 2 2012
Firstpage :
1
Lastpage :
9
Abstract :
Map-Reduce has become an important paradigm for data-intensive computations. The ability to estimate Map-Reduce application performance is critical for efficient resource scheduling and provisioning both on dedicated clusters and on the cloud. Current state-of-the-art techniques for performance prediction of Map-Reduce applications use analytical and simulation-based models. In this paper, we make the case for performance prediction using regression techniques based on machine learning. Through modeling the Map-Reduce environment as a grey-box, we can leverage a combination of externally observed system features and information about sub-system internals. We identify four learning techniques with high prediction accuracy through a detailed comparative study of twenty methods. The powerful capabilities of data analytics platforms are usually accompanied by frequent faults that occur due to scale, complexity and the use of commercial off-the-shelf components. We show that our proposed approach can effectively predict degraded performance under these faulty conditions by the inclusion of additional fault-related input features. A mean prediction error of <12% was achieved across the range of parameters studied on a 64-node Xen virtualized environment running an open-source Map-Reduce implementation, Hadoop.
Keywords :
cloud computing; data analysis; grey systems; learning (artificial intelligence); scheduling; virtual reality; Hadoop; Xen virtualized environment; cloud; commercial off-the-shelf components; data analytics; data-intensive computations; dedicated clusters; grey-box approach; machine learning; map-reduce based platforms; performance prediction; resource scheduling; Computational modeling; Fault tolerance; Fault tolerant systems; Mathematical model; Middleware; Predictive models; Virtual machining;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Computer Communications and Networks (ICCCN), 2012 21st International Conference on
Conference_Location :
Munich
Print_ISBN :
978-1-4673-1543-2
Type :
conf
DOI :
10.1109/ICCCN.2012.6289311
Filename :
6289311