Title :
Perturbation-based Fault Screening
Author :
Racunas, Paul ; Constantinides, Kypros ; Manne, Srilatha ; Mukherjee, Shubhendu S.
Author_Institution :
FACT Group, Intel Corp., Hudson, MA
Abstract :
Fault screeners are a new breed of fault identification technique that can probabilistically detect if a transient fault has affected the state of a processor. We demonstrate that fault screeners function because of two key characteristics. First, we show that much of the intermediate data generated by a program inherently falls within certain consistent bounds. Second, we observe that these bounds are often violated by the introduction of a fault. Thus, fault screeners can identify faults by directly watching for any data inconsistencies arising in an application's behavior. We present an idealized algorithm capable of identifying over 85% of injected faults on the SpecInt suite and over 75% overall. Further, in a realistic implementation on a simulated Pentium-III-like processor, about half of the errors due to injected faults are identified while still in speculative state. Errors detected this early can be eliminated by a pipeline flush. In this paper, we present several hardware-based versions of this screening algorithm and show that flushing the pipeline every time the hardware screener triggers reduces overall performance by less than 1%.
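The two observations in the abstract (intermediate values stay within consistent bounds, and faults tend to violate those bounds) can be illustrated with a minimal software sketch. This is not the paper's algorithm or any of its hardware screeners. The class `RangeScreener`, the `slack` parameter, the helper `flip_bit`, and the per-instruction min/max policy are all hypothetical choices made for illustration only.

```python
class RangeScreener:
    """Hypothetical screener: keeps one [lo, hi] interval per static instruction."""

    def __init__(self, slack=0):
        self.bounds = {}    # pc -> (lo, hi) observed so far
        self.slack = slack  # tolerance before a value counts as a perturbation

    def observe(self, pc, value):
        """Return True if value falls outside the bounds learned for pc."""
        if pc not in self.bounds:
            # First observation: nothing to compare against yet.
            self.bounds[pc] = (value, value)
            return False
        lo, hi = self.bounds[pc]
        triggered = value < lo - self.slack or value > hi + self.slack
        # Widen the interval so a legitimate new value triggers at most once.
        self.bounds[pc] = (min(lo, value), max(hi, value))
        return triggered


def flip_bit(value, bit):
    """Model a transient fault as a single bit flip (an assumption)."""
    return value ^ (1 << bit)


screener = RangeScreener(slack=8)

# Fault-free behavior: an instruction at a made-up address produces 0..99.
for i in range(100):
    screener.observe(0x40, i)

clean = screener.observe(0x40, 50)                  # inside learned bounds
faulty = screener.observe(0x40, flip_bit(50, 20))   # high-order bit flipped
low_bit = screener.observe(0x40, flip_bit(50, 0))   # low-order bit flipped

print(clean, faulty, low_bit)
```

In this sketch the high-order flip lands far outside the learned interval and triggers the screener, while the low-order flip stays inside it and goes unnoticed, which is one way to picture why detection is probabilistic rather than guaranteed.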
Keywords :
fault location; fault tolerance; microprocessor chips; SpecInt; data inconsistencies; error detection; fault identification; perturbation-based fault screening; screening algorithm; simulated Pentium-III-like processor; transient fault; CMOS technology; Circuit faults; Costs; Error analysis; Fault diagnosis; Fault tolerance; Hardware; Microprocessors; Pipelines; Protection
Conference_Title :
High Performance Computer Architecture, 2007. HPCA 2007. IEEE 13th International Symposium on
Conference_Location :
Scottsdale, AZ
Print_ISBN :
1-4244-0805-9
Electronic_ISBN :
1-4244-0805-9
DOI :
10.1109/HPCA.2007.346195