• DocumentCode
    2959898
  • Title
    Scalable Distributed Consensus to Support MPI Fault Tolerance
  • Author
    Buntinas, Darius
  • Author_Institution
    Argonne Nat. Lab., Argonne, IL, USA
  • fYear
    2012
  • fDate
    21-25 May 2012
  • Firstpage
    1240
  • Lastpage
    1249
  • Abstract
    As system sizes increase, the amount of time in which an application can run without experiencing a failure decreases. Exascale applications will need to address fault tolerance. In order to support algorithm-based fault tolerance, communication libraries will need to provide fault-tolerance features to the application. One important fault-tolerance operation is distributed consensus. This is used, for example, to collectively decide on a set of failed processes. This paper describes a scalable, distributed consensus algorithm that is used to support new MPI fault-tolerance features proposed by the MPI 3 Forum's fault-tolerance working group. The algorithm was implemented and evaluated on a 4,096-core Blue Gene/P. The implementation was able to perform a full-scale distributed consensus in 222 μs and scaled logarithmically.
  • Keywords
    application program interfaces; fault tolerant computing; message passing; 4096-core Blue Gene-P; MPI3 Forum fault-tolerance working group; algorithm-based fault tolerance; communication libraries; exascale applications; scalable distributed consensus algorithm; Checkpointing; Detectors; Fault tolerance; Fault tolerant systems; Libraries; Proposals; Semantics;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Parallel & Distributed Processing Symposium (IPDPS), 2012 IEEE 26th International
  • Conference_Location
    Shanghai
  • ISSN
    1530-2075
  • Print_ISBN
    978-1-4673-0975-2
  • Type
    conf
  • DOI
    10.1109/IPDPS.2012.113
  • Filename
    6267926