DocumentCode :
725330
Title :
Fault-Tolerant and Elastic Streaming MapReduce with Decentralized Coordination
Author :
Kumbhare, Alok ; Frincu, Marc ; Simmhan, Yogesh ; Prasanna, Viktor K.
Author_Institution :
Univ. of Southern California, Los Angeles, CA, USA
fYear :
2015
fDate :
June 29 2015-July 2 2015
Firstpage :
328
Lastpage :
338
Abstract :
The MapReduce programming model, due to its simplicity and scalability, has become an essential tool for processing large data volumes in distributed environments. Recent Stream Processing Systems (SPS) extend this model to provide low-latency analysis of high-velocity continuous data streams. However, integrating MapReduce with streaming poses two challenges. First, runtime variations in data characteristics such as data rates and key distribution cause resource overload, which in turn leads to fluctuations in the Quality of Service (QoS). Second, stateful reducers, whose state depends on the complete tuple history, necessitate efficient fault-recovery mechanisms to maintain the desired QoS in the presence of resource failures. We propose an integrated streaming MapReduce architecture that leverages consistent hashing to support runtime elasticity, along with locality-aware data and state replication, to provide efficient load balancing with low-overhead fault tolerance and parallel fault recovery from multiple simultaneous failures. Our evaluation on a private cloud shows up to 2.8× improvement in peak throughput compared to the Apache Storm SPS, and a low recovery latency of 700 to 1500 ms from multiple failures.
Keywords :
data handling; parallel processing; quality of service; resource allocation; software fault tolerance; Apache Storm SPS; MapReduce programming model; QoS; data-rates; decentralized coordination; elastic streaming; fault-recovery mechanisms; high-velocity continuous data streams; integrated streaming architecture; key-distribution; load-balancing; locality-aware data; low-latency analysis; low-overhead fault-tolerance; parallel fault-recovery; private cloud; quality of the service; resource failures; runtime elasticity; runtime variations; state replication; stream processing systems; tuple history; Checkpointing; Elasticity; Fault tolerance; Fault tolerant systems; Peer-to-peer computing; Runtime; Storms; Distributed Stream processing; Streaming mapreduce; big data; fault-tolerance; runtime elasticity;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Distributed Computing Systems (ICDCS), 2015 IEEE 35th International Conference on
Conference_Location :
Columbus, OH
ISSN :
1063-6927
Type :
conf
DOI :
10.1109/ICDCS.2015.41
Filename :
7164919