DocumentCode
3321708
Title
Optimistic recovery in multi-threaded distributed systems
Author
Damani, Om P. ; Tarafdar, Ashis ; Garg, Vijay K.
Author_Institution
Dept. of Comput. Sci., Texas Univ., Austin, TX, USA
fYear
1999
fDate
1999
Firstpage
234
Lastpage
243
Abstract
The problem of recovering distributed systems from crash failures has been widely studied in the context of traditional non-threaded processes. However, extending those solutions to the multi-threaded scenario presents new problems. We identify and address these problems for optimistic logging protocols. There are two natural extensions to optimistic logging protocols in the multi-threaded scenario. The first extension is process-centric, where the points of internal non-determinism caused by threads are logged. The second extension is thread-centric, where each thread is treated as a separate process. The process-centric approach suffers from false causality, while the thread-centric approach suffers from high causality tracking overhead. By observing that the granularity of failures can be different from the granularity of rollbacks, we design a new balanced approach which incurs low causality tracking overhead and also eliminates false causality.
Keywords
multi-threading; system recovery; crash failures; distributed systems; multi-threaded; optimistic logging protocols; process-centric; recovering distributed systems; thread-centric; Checkpointing; Computer crashes; Concurrent computing; Electronic switching systems; Protocols; Read only memory; Yarn;
fLanguage
English
Publisher
IEEE
Conference_Titel
Reliable Distributed Systems, 1999. Proceedings of the 18th IEEE Symposium on
Conference_Location
Lausanne
ISSN
1060-9857
Print_ISBN
0-7695-0290-3
Type
conf
DOI
10.1109/RELDIS.1999.805099
Filename
805099