Title :
Concurrent error detection using watchdog processors - a survey
Author :
Mahmood, Aamer ; McCluskey, E.J.
Author_Institution :
Comput. Syst. Lab., Stanford Univ., CA, USA
Abstract :
Concurrent system-level error detection techniques using a watchdog processor are surveyed. A watchdog processor is a small and simple coprocessor that detects errors by monitoring the behavior of a system. Like replication, it does not depend on any fault model for error detection, but it requires less hardware than replication. It is shown that a large number of errors can be detected by monitoring the control flow and memory-access behavior. Two techniques for control-flow checking are discussed and compared with current error-detection techniques. A scheme for memory-access checking based on capability-based addressing is described. The design of a watchdog for performing reasonableness checks on the output of a main processor by executing assertions is discussed.
Keywords :
automatic testing; computer architecture; computer testing; computerised monitoring; error detection; fault tolerant computing; reviews; satellite computers; special purpose computers; assertion execution; capability-based addressing; concurrent error detection; control-flow checking; coprocessor; memory-access checking; reasonable checks; system behaviour monitoring; system-level error detection; watchdog processors; Circuit faults; Circuit testing; Computer errors; Coprocessors; Error correction; Error correction codes; Fault detection; Laboratories; Monitoring; Phase detection
Journal_Title :
Computers, IEEE Transactions on