Title :
Design of new roll-forward recovery approach for distributed systems
Author :
Gupta, B. ; Banerjee, S.K. ; Liu, B.
Author_Institution :
Dept. of Comput. Sci., Southern Illinois Univ., Carbondale, IL, USA
fDate :
5/1/2002 12:00:00 AM
Abstract :
A new roll-forward checkpointing scheme based on basic checkpoints is proposed. The direct-dependency concept used in communication-induced checkpointing is applied to basic checkpoints to design a simple algorithm for finding a consistent global checkpoint. Both blocking (i.e., the application processes are suspended while the algorithm executes) and non-blocking approaches are presented. The use of forced checkpoints ensures a small re-execution time after recovery from a failure. The proposed approaches combine the main advantages of the synchronous and asynchronous approaches, i.e., simple recovery and a simple way to create checkpoints. In addition, in the proposed blocking approach, the direct-dependency concept is implemented without piggybacking any extra information on application messages. A very simple scheme for avoiding the creation of useless checkpoints is also proposed.
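To make the notion of a consistent global checkpoint concrete, the following is a minimal sketch of the textbook definition (no orphan messages) and of generic rollback propagation. It is not the algorithm proposed in the paper, which the abstract does not spell out. The names `Message`, `orphan_messages`, `is_consistent` and `roll_back_to_consistent`, and the interval-numbering convention, are assumptions made for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Message:
    """One application message (hypothetical record layout).

    Interval k of a process is the period after its checkpoint k and
    before its checkpoint k + 1.
    """
    sender: int
    send_interval: int   # sent after the sender's checkpoint with this index
    receiver: int
    recv_interval: int   # received after the receiver's checkpoint with this index


def orphan_messages(cut, messages):
    """Return messages whose receipt is recorded in the cut but whose send is not.

    `cut` maps each process id to the index of its chosen checkpoint.
    """
    return [
        m for m in messages
        if m.send_interval >= cut[m.sender]       # send happens after sender's checkpoint
        and m.recv_interval < cut[m.receiver]     # receipt happens before receiver's checkpoint
    ]


def is_consistent(cut, messages):
    """A global checkpoint is consistent when it contains no orphan message."""
    return not orphan_messages(cut, messages)


def roll_back_to_consistent(cut, messages):
    """Generic rollback propagation: lower checkpoints until no orphan remains."""
    cut = dict(cut)  # do not mutate the caller's mapping
    changed = True
    while changed:
        changed = False
        for m in orphan_messages(cut, messages):
            if m.recv_interval < cut[m.receiver]:
                # Move the receiver back so that the receipt is undone.
                cut[m.receiver] = m.recv_interval
                changed = True
    return cut


if __name__ == "__main__":
    msgs = [Message(sender=0, send_interval=1, receiver=1, recv_interval=0)]
    start = {0: 1, 1: 1}
    print(is_consistent(start, msgs))
    print(roll_back_to_consistent(start, msgs))
```

In this sketch the starting cut is inconsistent because process 1 has recorded a message that process 0 has not yet sent at its own checkpoint, so process 1 is moved back by one checkpoint. Uncontrolled propagation of such rollbacks is the cost that forced checkpoints are meant to keep small.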
Keywords :
system recovery; checkpointing scheme; communication-induced checkpointing scheme; distributed systems; roll-forward recovery approach
Journal_Title :
Computers and Digital Techniques, IEE Proceedings -
DOI :
10.1049/ip-cdt:20020410