• DocumentCode
    2990212
  • Title

    Effective sender-based message logging algorithm with checkpointing considering transient communication errors

  • Author

    Ahn, Jinho

  • Author_Institution
    Dept. of Comput. Sci., Kyonggi Univ., Suwon, South Korea
  • fYear
    2011
  • fDate
    4-8 July 2011
  • Firstpage
    330
  • Lastpage
    335
  • Abstract
    Sender-based message logging (SBML) with checkpointing requires no specialized hardware and, by logging messages in the sender's volatile memory, greatly lowers the failure-free overhead of synchronous logging. These features make it applicable to many distributed systems as a low-cost, transparent rollback recovery technique. However, the original SBML recovery algorithm may fail to make progress in some cases of transient communication errors. This paper proposes a consistent recovery algorithm that solves this problem by piggybacking a small amount of log information about received unstable messages on each acknowledgement message that returns the receive sequence number assigned to a message by its receiver. This feature also allows message send operations that are delayed after some message receive operations during failure-free execution to begin executing much earlier than in existing algorithms.
  • Keywords
    checkpointing; message passing; consistent recovery algorithm; distributed system; failure-free execution; log information piggybacking; sender-based message logging recovery algorithm; synchronous logging; transient communication errors; transparent rollback recovery technique; volatile logging; Computational modeling; Delay; Fault tolerance; Memory management; Receivers; Transient analysis; consistent recovery; distributed systems; fault-tolerance; message logging; scalability
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    High Performance Computing and Simulation (HPCS), 2011 International Conference on
  • Conference_Location
    Istanbul
  • Print_ISBN
    978-1-61284-380-3
  • Type

    conf

  • DOI
    10.1109/HPCSim.2011.5999842
  • Filename
    5999842