• DocumentCode
    3321708
  • Title

    Optimistic recovery in multi-threaded distributed systems

  • Author

    Damani, Om P. ; Tarafdar, Ashis ; Garg, Vijay K.

  • Author_Institution
    Dept. of Comput. Sci., Texas Univ., Austin, TX, USA
  • fYear
    1999
  • fDate
    1999
  • Firstpage
    234
  • Lastpage
    243
  • Abstract
    The problem of recovering distributed systems from crash failures has been widely studied in the context of traditional non-threaded processes. However, extending those solutions to the multi-threaded scenario presents new problems. We identify and address these problems for optimistic logging protocols. There are two natural extensions to optimistic logging protocols in the multi-threaded scenario. The first extension is process-centric, where the points of internal non-determinism caused by threads are logged. The second extension is thread-centric, where each thread is treated as a separate process. The process-centric approach suffers from false causality, while the thread-centric approach suffers from high causality tracking overhead. By observing that the granularity of failures can be different from the granularity of rollbacks, we design a new balanced approach which incurs low causality tracking overhead and also eliminates false causality.
  • Keywords
    multi-threading; system recovery; crash failures; distributed systems; multi-threaded; optimistic logging protocols; process-centric; recovering distributed systems; thread-centric; Checkpointing; Computer crashes; Concurrent computing; Electronic switching systems; Protocols; Read only memory; Yarn
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Reliable Distributed Systems, 1999. Proceedings of the 18th IEEE Symposium on
  • Conference_Location
    Lausanne
  • ISSN
    1060-9857
  • Print_ISBN
    0-7695-0290-3
  • Type

    conf

  • DOI
    10.1109/RELDIS.1999.805099
  • Filename
    805099