DocumentCode
3063047
Title
Byzantine Fault-Tolerant MapReduce: Faults are Not Just Crashes
Author
Costa, Pedro ; Pasin, Marcelo ; Bessani, Alysson N. ; Correia, Miguel
Author_Institution
Fac. de Cienc., Univ. de Lisboa, Lisbon, Portugal
fYear
2011
fDate
Nov. 29 2011-Dec. 1 2011
Firstpage
32
Lastpage
39
Abstract
MapReduce is often used to run critical jobs such as scientific data analysis. However, evidence in the literature shows that arbitrary faults do occur and can probably corrupt the results of MapReduce jobs. MapReduce runtimes like Hadoop tolerate crash faults, but not arbitrary or Byzantine faults. We present a MapReduce algorithm and prototype that tolerate these faults. An experimental evaluation shows that the execution of a job with our algorithms uses twice the resources of the original Hadoop, instead of the 3 or 4 times more that would be achieved with the direct application of common Byzantine fault-tolerance paradigms. We believe this cost is acceptable for critical applications that require that level of fault tolerance.
Keywords
parallel processing; search engines; software fault tolerance; Byzantine fault-tolerant MapReduce; MapReduce job; original Hadoop; scientific data analysis; Computer crashes; Fault tolerance; Fault tolerant systems; Google; Heart beat; Runtime; Servers; Byzantine Fault-Tolerance; Hadoop MapReduce; arbitrary faults;
fLanguage
English
Publisher
ieee
Conference_Titel
Cloud Computing Technology and Science (CloudCom), 2011 IEEE Third International Conference on
Conference_Location
Athens
Print_ISBN
978-1-4673-0090-2
Type
conf
DOI
10.1109/CloudCom.2011.15
Filename
6133124
Link To Document