• DocumentCode
    287632
  • Title
    Efficient and fault-tolerant checkpointing procedures for distributed systems
  • Author
    Saleh, Kassem ; Agarwal, Anjali
  • Author_Institution
    Dept. of Electr. & Comput. Eng., Kuwait Univ., Kuwait
  • fYear
    1993
  • fDate
    23-26 Mar 1993
  • Firstpage
    161
  • Lastpage
    167
  • Abstract
    Problems related to fault tolerance in distributed systems are tackled by providing efficient and fault-tolerant procedures for checkpointing and rollback recovery in such systems. The authors propose checkpointing algorithms which can be initiated by any process in the system, or upon failure of one or more component processes as part of a backward recovery procedure. The algorithms return the most recent consistent checkpoints, require less stable storage, and do not interfere with the progress of the distributed system application. Obtaining a consistent checkpoint is always guaranteed. Examples illustrating these algorithms are also provided.
  • Keywords
    distributed databases; fault tolerant computing; backward recovery procedure; distributed systems; fault-tolerant algorithm procedures; fault-tolerant checkpointing procedures; rollback recovery; Checkpointing; Delay; Distributed algorithms; Distributed computing; Fault tolerant systems; Joining processes; Law; Legal factors; Resumes; System recovery
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Computers and Communications, 1993., Twelfth Annual International Phoenix Conference on
  • Conference_Location
    Tempe, AZ
  • Print_ISBN
    0-7803-0922-7
  • Type
    conf
  • DOI
    10.1109/PCCC.1993.344469
  • Filename
    344469