Title :
Logic Design for Dynamic and Interactive Recovery
Author :
Carter, W.C. ; Jessep, D.C. ; Wadia, Aspi B. ; Schneider, Peter R. ; Bouricius, Willard G.
Author_Institution :
IEEE
Abstract :
Recovery in a fault-tolerant computer means the continuation of system operation with data integrity after an error occurs. This paper delineates two parallel concepts embodied in the hardware and software functions required for recovery: detection, diagnosis, and reconfiguration for the hardware; data integrity, checkpointing, and restart for the software. The hardware relies on the recovery variable set, checking circuits, and diagnostics, and the software relies on the recovery information set, audit, and reconstruct routines, to characterize the system state and assist in recovery when required. Of particular utility is a hardware unit, the recovery control unit, which serves as an interface between error detection and software recovery programs in the supervisor and provides dynamic interactive recovery.
Keywords :
Checkpoint and retry, computer diagnostics and audit programs, computer reconfiguration, computer repair, fault masking, information integrity, interactive recovery, interrupts, logout; Built-in self-test; Checkpointing; Circuits; Computer errors; Contamination; Error correction; Fault tolerant systems; Hardware; Logic design; Process design
Journal_Title :
Computers, IEEE Transactions on
DOI :
10.1109/T-C.1971.223131