Author_Institution :
Dept. of Electr. Eng. & Comput. Sci., Michigan Univ., Ann Arbor, MI, USA
Abstract :
A system's ability to recover quickly from transient errors is particularly important for systems that operate in hostile environments where bursts of high-frequency errors are likely. Evaluation of this ability poses a number of problems, including appropriate modeling of both the error-arrival process and the system's workload. These two problems are addressed in the context of a specific evaluation study, where the system in question is a self-exercising, self-checking (SE/SC) memory design. Evaluation is based on a stochastic activity network model of the total system (memory, workload, and error environment) where, for comparison, both the SE/SC memory and a 'standard' memory are modeled in this fashion. The system workload is parameterized and its effect on recovery is evaluated for different choices of parameter values. The central measure of interest is the memory's ability to recover from bursts of transient errors. It is found that coverage is indeed workload dependent, where this dependence is particularly severe in the case of a standard memory. The results also show that, for the SE/SC design, coverage is less sensitive to workload, and in all cases, exceeds that of a standard memory.
Keywords :
fault location; random-access storage; stochastic processes; error recovery; error-arrival process; random access memories; stochastic activity network model; transient errors; workload effect; Built-in self-test; Computer errors; Estimation theory; Extrapolation; Hardware; Logic testing; Polynomials; Size control; Test pattern generators; Very large scale integration