DocumentCode :
3063108
Title :
Optimizing Multiple Machine Learning Jobs on MapReduce
Author :
Tamano, Hiroshi ; Nakadai, Shinji ; Araki, Takuya
Author_Institution :
Service Platforms Res. Labs., NEC Corp., Kawasaki, Japan
fYear :
2011
fDate :
Nov. 29 2011-Dec. 1 2011
Firstpage :
59
Lastpage :
66
Abstract :
Recently, MapReduce has been used to parallelize machine learning algorithms. Obtaining the best performance from these algorithms requires tuning their parameters. However, tuning is time-consuming because it requires executing a MapReduce program multiple times with various parameters. Such multiple executions can be assigned to a cluster in various ways, and the execution time varies depending on the assignment. To achieve the shortest execution time, we propose a method for optimizing the assignment of MapReduce jobs to a cluster, assuming a runtime targeted at machine learning. We developed an execution cost model to predict the total execution time of jobs and obtained the optimal assignment by minimizing the cost model. To evaluate the proposed method, we implemented an experimental MapReduce runtime based on the Message Passing Interface and executed logistic regression in four cases. The results showed that the proposed method can correctly predict the optimal job assignment. We also confirmed that the optimal assignment reduced execution time by up to 77% compared to the worst assignment.
Keywords :
learning (artificial intelligence); message passing; parallel algorithms; MapReduce job assignment; MapReduce program; cost model minimization; executed logistic regression; job total execution time prediction; machine learning targeted runtime; message passing interface; multiple machine learning jobs; parallelize machine learning algorithms; Computational modeling; Distributed databases; Machine learning; Machine learning algorithms; Optimization; Runtime; Vectors; Job Scheduling; Machine Learning; MapReduce;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Cloud Computing Technology and Science (CloudCom), 2011 IEEE Third International Conference on
Conference_Location :
Athens
Print_ISBN :
978-1-4673-0090-2
Type :
conf
DOI :
10.1109/CloudCom.2011.18
Filename :
6133127