Title :
Dynamic reconfiguration of CSP programs for fault tolerance
Author_Institution :
Dept. of Comput. Sci. & Eng., Indian Inst. of Technol., Kanpur, India
Abstract :
In a distributed computation performed by a network of communicating processes, the failure of a process due to the failure of its host node can cause the entire computation to be aborted. The author proposes a scheme that makes a distributed program resilient to the failure of one of its constituent processes, so that the computation completes despite the failure. The scheme is for CSP programs and allows nondeterminism within a process. In CSP, input/output commands name the partner process explicitly. Because communication is synchronous, a process P may block if a process named in one of its input/output commands does not execute a matching output/input command. In the proposed scheme, if a process fails, another process starts executing on a backup node from the last checkpoint (CP) of the failed process. Programmed exception handling is used to ensure proper recovery and fault tolerance.
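The recovery idea in the abstract (restart a failed process on a backup node from its last checkpoint, triggered by a programmed exception handler) can be illustrated with a small single-threaded simulation. This is only a sketch under stated assumptions, not the paper's algorithm: the names `Registry`, `Checkpoint`, `run_worker`, and `run_with_recovery` are hypothetical, the checkpoint store is assumed to survive the node failure, inputs consumed after the last checkpoint are assumed to be available again for replay, and nondeterminism within a process is not modelled.

```python
from dataclasses import dataclass, field


class NodeFailure(Exception):
    """Simulated failure of the node hosting a process."""


@dataclass
class Checkpoint:
    next_index: int  # index of the next input the process will consume
    total: int       # process state (a running sum) at the checkpoint


@dataclass
class Registry:
    """Hypothetical map from a logical CSP process name to its current node."""
    location: dict = field(default_factory=dict)

    def rebind(self, name, node):
        # Partners keep addressing the process by name; only the node changes.
        self.location[name] = node


def run_worker(inputs, store, name, fail_at=None, every=2):
    """Resume process `name` from its checkpoint in `store` and sum the inputs.

    Raises NodeFailure just before consuming input number `fail_at`.
    A checkpoint is written after every `every` inputs (an assumption).
    """
    cp = store[name]
    total = cp.total
    for i in range(cp.next_index, len(inputs)):
        if fail_at is not None and i == fail_at:
            raise NodeFailure(f"node hosting {name} failed before input {i}")
        total += inputs[i]
        if (i + 1) % every == 0:
            store[name] = Checkpoint(i + 1, total)
    return total


def run_with_recovery(inputs, fail_at, registry, store, name="P"):
    """Run on the primary node; on failure, restart on the backup node."""
    store[name] = Checkpoint(0, 0)
    registry.rebind(name, "primary")
    try:
        return run_worker(inputs, store, name, fail_at=fail_at)
    except NodeFailure:
        # Programmed exception handler: reconfigure, then resume from last CP.
        registry.rebind(name, "backup")
        return run_worker(inputs, store, name, fail_at=None)


registry, store = Registry(), {}
result = run_with_recovery([1, 2, 3, 4, 5], fail_at=3, registry=registry, store=store)
```

In this run the primary fails after consuming three inputs but with a checkpoint covering only the first two, so the backup replays the third input before continuing. The final sum matches a failure-free run, and the name `P` now resolves to the backup node.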
Keywords :
communicating sequential processes; exception handling; fault tolerant computing; system recovery; CSP programs; backup node; distributed computation; distributed program; dynamic reconfiguration; fault tolerance; input/output commands; nondeterminism; programmed exception handling; synchronous communication; Broadcasting; Checkpointing; Computer networks; Computer science; Distributed computing; Fault tolerance; Impedance matching; Message passing; Postal services
Conference_Title :
Fault-Tolerant Computing, 1992. FTCS-22. Digest of Papers., Twenty-Second International Symposium on
Conference_Location :
Boston, MA, USA
Print_ISBN :
0-8186-2875-8
DOI :
10.1109/FTCS.1992.243616