DocumentCode :
2029024
Title :
NEWT - A Fault Tolerant BSP Framework on Hadoop YARN
Author :
Kromonov, Ilja ; Jakovits, P. ; Srirama, Satish Narayana
Author_Institution :
Inst. of Comput. Sci., Univ. of Tartu, Tartu, Estonia
fYear :
2013
fDate :
9-12 Dec. 2013
Firstpage :
309
Lastpage :
310
Abstract :
The importance of fault tolerance for the parallel computing field is ever increasing, as the mean time between failures is predicted to decrease significantly for future highly parallel systems. The current trend of using commodity hardware to reduce the cost of clusters forces users to ensure that their applications are fault tolerant. When it comes to embarrassingly parallel data-intensive algorithms, MapReduce has gone a long way in simplifying the creation of such applications. However, this does not apply to iterative communication-intensive algorithms common in the scientific computing domain. In this work we propose a new programming model inspired by Bulk Synchronous Parallel (BSP) for creating a new fault tolerant distributed computing framework. We strive to retain the advantages that MapReduce provides, yet efficiently support a larger assortment of algorithms, such as the aforementioned iterative ones.
Keywords :
fault tolerant computing; parallel algorithms; parallel programming; Hadoop YARN; NEWT; bulk synchronous parallel framework; cluster cost reduction; commodity hardware; fault tolerant BSP framework; fault tolerant distributed computing framework; parallel computing; parallel data-intensive algorithms; parallel system; programming model; Adaptation models; Clustering algorithms; Computational modeling; Fault tolerance; Fault tolerant systems; Prototypes; Yarn; BSP; Distributed computing; MPI; MapReduce; fault tolerance; parallel computing;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Utility and Cloud Computing (UCC), 2013 IEEE/ACM 6th International Conference on
Conference_Location :
Dresden
Type :
conf
DOI :
10.1109/UCC.2013.66
Filename :
6809422