Regional Information Center for Science and Technology (RICeST) - Distributed MapReduce engine with fault tolerance

DocumentCode :

1789380

Title :

Distributed MapReduce engine with fault tolerance

Author :

Lixing Song ; Shaoen Wu ; Honggang Wang ; Qing Yang

Author_Institution :

Dept. of Comput. Sci., Ball State Univ., Muncie, IN, USA

fYear :

2014

fDate :

10-14 June 2014

Firstpage :

3626

Lastpage :

3630

Abstract :

Hadoop is the de facto engine that drives current cloud computing practice. The current Hadoop architecture suffers from single-point-of-failure problems: its job management lacks fault tolerance. If the management of a job fails, the job loses all state information and has to restart from scratch, even if its tasks remain active on cloud nodes. In this work, we propose a distributed MapReduce engine for Hadoop built on the Distributed Hash Table (DHT) algorithm that drives today's scalable peer-to-peer networks. The distributed Hadoop engine provides the fault-tolerance capability needed to support efficient job computation in cloud computing, where numerous jobs run at any given moment. We have implemented the proposed distributed solution in Hadoop and evaluated its performance under job failures in various network deployments.
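
The abstract does not describe how the DHT is used, so the following is only a minimal sketch of the general idea, not the authors' design. It assumes a consistent-hash ring in which each job's state is written to the first few nodes clockwise from the job's hash position, so that a successor node can take over when the primary manager fails. The names `JobRing`, `ring_position`, `save`, `load`, and `fail`, and the `replicas` parameter, are all hypothetical.

```python
import bisect
import hashlib


def ring_position(key: str) -> int:
    """Map a string onto a 32-bit ring position using SHA-1."""
    digest = hashlib.sha1(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big")


class JobRing:
    """Toy consistent-hash ring that assigns each job to manager nodes.

    Hypothetical illustration only; it is not taken from the paper.
    """

    def __init__(self, nodes, replicas=2):
        self.replicas = replicas
        # Sorted list of (ring position, node name) pairs.
        self.ring = sorted((ring_position(n), n) for n in nodes)
        # In-memory stand-in for each node's job state store.
        self.state = {n: {} for n in nodes}

    def owners(self, job_id):
        """Return the first `replicas` live nodes clockwise from the job."""
        positions = [p for p, _ in self.ring]
        start = bisect.bisect_right(positions, ring_position(job_id))
        count = min(self.replicas, len(self.ring))
        return [self.ring[(start + i) % len(self.ring)][1] for i in range(count)]

    def save(self, job_id, job_state):
        """Write a copy of the job state to every current owner."""
        for node in self.owners(job_id):
            self.state[node][job_id] = dict(job_state)

    def load(self, job_id):
        """Read the job state from the first owner that holds a copy."""
        for node in self.owners(job_id):
            if job_id in self.state[node]:
                return self.state[node][job_id]
        return None

    def fail(self, node):
        """Simulate a crash: drop the node and everything it stored."""
        self.ring = [(p, n) for p, n in self.ring if n != node]
        self.state.pop(node, None)
```

In this sketch, creating `JobRing(["node-a", "node-b", "node-c"])`, saving a job's state, and then calling `fail` on the job's first owner leaves the state readable through `load`, because the next node on the ring already holds a replica and becomes the new first owner. A real system would also need to re-replicate state after a failure and detect failures over the network, which this toy omits.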

Keywords :

cloud computing; fault tolerant computing; peer-to-peer computing; DHT algorithm; Hadoop architecture; cloud nodes; distributed MapReduce engine; distributed hash table algorithm; fault tolerance; job computation; job management; peer-to-peer networks; Computer architecture; Engines; Fault tolerance; Fault tolerant systems; Peer-to-peer computing; Switches; Synchronization

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Communications (ICC), 2014 IEEE International Conference on

Conference_Location :

Sydney, NSW

Type :

conf

DOI :

10.1109/ICC.2014.6883884

Filename :

6883884

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=1789380