DocumentCode :
9263
Title :
CRESP: Towards Optimal Resource Provisioning for MapReduce Computing in Public Clouds
Author :
Keke Chen ; Jacob Powers ; Shumin Guo ; Fengguang Tian
Author_Institution :
Dept. of Comput. Sci. & Eng., Wright State Univ., Dayton, OH, USA
Volume :
25
Issue :
6
fYear :
2014
fDate :
June 2014
Firstpage :
1403
Lastpage :
1412
Abstract :
Running MapReduce programs in the cloud introduces a unique problem: how should resource provisioning be optimized to minimize the monetary cost or the finish time of a specific job? We study the whole MapReduce processing workflow and build a cost function that explicitly models the relationship among the time cost, the amount of input data, the available system resources (Map and Reduce slots), and the complexity of the Reduce function for the target MapReduce job. The model parameters can be learned from test runs. Based on this cost function, we can solve a number of decision problems, such as finding the amount of resources that minimizes monetary cost within a job finish deadline, minimizes time cost under a given monetary budget, or gives the optimal tradeoff between time and monetary costs. Experimental results show that the proposed approach performs well on a number of sample MapReduce programs in both an in-house cluster and Amazon EC2. We also conducted a variance analysis on different components of the MapReduce workflow to show the possible sources of modeling error. Our optimization results show that the proposed approach can save a significant amount of time and money compared to randomly selected settings.
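The deadline-constrained decision problem mentioned in the abstract can be illustrated with a minimal sketch. The model form, the coefficients in `BETA`, the per-slot price, and all function names below are hypothetical assumptions made for illustration only; they are not the cost function or the parameter values from the paper, which learns its parameters from test runs.

```python
from itertools import product

# Hypothetical coefficients (seconds), standing in for values that would
# be learned from test runs: fixed overhead, map term, reduce term.
BETA = (30.0, 2.0, 1.5)


def job_time(input_blocks, map_slots, reduce_slots, beta=BETA):
    """Illustrative time model (an assumption, not the paper's function):
    fixed overhead plus work divided across Map and Reduce slots."""
    b0, b1, b2 = beta
    return b0 + b1 * input_blocks / map_slots + b2 * input_blocks / reduce_slots


def monetary_cost(seconds, map_slots, reduce_slots, price_per_slot_hour=0.05):
    """Assumed pricing: every slot is billed for the whole job duration."""
    return (map_slots + reduce_slots) * price_per_slot_hour * seconds / 3600.0


def cheapest_within_deadline(input_blocks, deadline_s, max_slots=64):
    """Exhaustively search slot counts; return (cost, map_slots,
    reduce_slots, time) for the cheapest setting that meets the deadline,
    or None if no setting in the search range is fast enough."""
    best = None
    for m, r in product(range(1, max_slots + 1), repeat=2):
        t = job_time(input_blocks, m, r)
        if t > deadline_s:
            continue  # this setting misses the deadline
        c = monetary_cost(t, m, r)
        if best is None or c < best[0]:
            best = (c, m, r, t)
    return best
```

Calling `cheapest_within_deadline(1000, 600)` returns the cheapest slot configuration found for a 1000-block input and a 10-minute deadline under these assumed numbers. Swapping the constraint and the objective gives the budget-constrained variant (minimize time under a cost cap).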
Keywords :
cloud computing; parallel programming; Amazon EC2; CRESP; MapReduce computing; MapReduce workflow; cost function; in-house cluster; monetary cost; optimal resource provisioning; public clouds; variance analysis; Analytical models; Complexity theory; Data models; Mathematical model; MapReduce; performance modeling; resource provisioning
fLanguage :
English
Journal_Title :
IEEE Transactions on Parallel and Distributed Systems
Publisher :
IEEE
ISSN :
1045-9219
Type :
jour
DOI :
10.1109/TPDS.2013.297
Filename :
6678508