• DocumentCode
    3063047
  • Title

    Byzantine Fault-Tolerant MapReduce: Faults are Not Just Crashes

  • Author

    Costa, Pedro ; Pasin, Marcelo ; Bessani, Alysson N. ; Correia, Miguel

  • Author_Institution
    Fac. de Cienc., Univ. de Lisboa, Lisbon, Portugal
  • fYear
    2011
  • fDate
    Nov. 29 2011-Dec. 1 2011
  • Firstpage
    32
  • Lastpage
    39
  • Abstract
    MapReduce is often used to run critical jobs such as scientific data analysis. However, evidence in the literature shows that arbitrary faults do occur and can probably corrupt the results of MapReduce jobs. MapReduce runtimes like Hadoop tolerate crash faults, but not arbitrary or Byzantine faults. We present a MapReduce algorithm and prototype that tolerate these faults. An experimental evaluation shows that executing a job with our algorithm uses twice the resources of the original Hadoop, instead of the 3 or 4 times that a direct application of common Byzantine fault-tolerance paradigms would require. We believe this cost is acceptable for critical applications that require that level of fault tolerance.
  • Keywords
    parallel processing; search engines; software fault tolerance; Byzantine fault-tolerant MapReduce; MapReduce job; original Hadoop; scientific data analysis; Computer crashes; Fault tolerance; Fault tolerant systems; Google; Heart beat; Runtime; Servers; Byzantine Fault-Tolerance; Hadoop MapReduce; arbitrary faults
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Cloud Computing Technology and Science (CloudCom), 2011 IEEE Third International Conference on
  • Conference_Location
    Athens
  • Print_ISBN
    978-1-4673-0090-2
  • Type

    conf

  • DOI
    10.1109/CloudCom.2011.15
  • Filename
    6133124