Title :
Selective recovery in distributed systems
Author :
Neogy, Sarmistha ; Sinha, Ampam ; Das, Pradip K.
Author_Institution :
Dept. of Comput. Sci. & Eng., Jadavpur Univ., Calcutta, India
Abstract :
Selective recovery of processes during rollback is a well-known technique for avoiding unnecessary rollbacks of a large number of processes after a failure is detected in a distributed system. The technique presented in this work selects a minimum number of processes for recovery; it is simple and, although the selection is performed on-line, it adds no overhead. A process may, however, be directly or even indirectly dependent on the faulty process and hence need to recover. The initial tracking of a process's dependence on the faulty process is done by checking local variables. To maintain consistency, a process may have to roll back even if it is not directly dependent on the faulty process.
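The idea of selecting only directly or indirectly dependent processes can be illustrated with a minimal sketch. This is not the authors' protocol: the abstract does not give its details. The function name `select_for_rollback`, the `depends_on` mapping, and the process labels are all hypothetical, and the dependence information is assumed to be available as one global table rather than gathered from each process's local variables.

```python
from collections import deque


def select_for_rollback(faulty, depends_on):
    """Return the processes that directly or indirectly depend on `faulty`.

    `depends_on` is a hypothetical mapping from each process to the set of
    processes it directly depends on. The faulty process itself is included
    in the result.
    """
    # Invert the mapping: for each process, who depends on it directly.
    dependents = {}
    for proc, sources in depends_on.items():
        for src in sources:
            dependents.setdefault(src, set()).add(proc)

    # Breadth-first search outward from the faulty process.
    selected = {faulty}
    queue = deque([faulty])
    while queue:
        current = queue.popleft()
        for proc in dependents.get(current, ()):
            if proc not in selected:
                selected.add(proc)
                queue.append(proc)
    return selected


# Illustrative data: P1 depends on P0, P2 depends on P1, P4 depends on P3.
deps = {"P1": {"P0"}, "P2": {"P1"}, "P3": set(), "P4": {"P3"}}
print(sorted(select_for_rollback("P0", deps)))  # ['P0', 'P1', 'P2']
```

In this example P2 is selected although it never interacted with P0 directly, while P3 and P4 are left running, which mirrors the abstract's point that indirect dependence can force a rollback and unrelated processes need not recover.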
Keywords :
checkpointing; fault tolerant computing; protocols; distributed system; failure detection; local variable; on-line selection; selective recovery
Conference_Title :
TENCON 2004. 2004 IEEE Region 10 Conference
Print_ISBN :
0-7803-8560-8
DOI :
10.1109/TENCON.2004.1414531