Title :
Trace-based microarchitecture-level diagnosis of permanent hardware faults
Author :
Li, Man-Lap ; Ramachandran, Pradeep ; Sahoo, Swarup K. ; Adve, Sarita V. ; Adve, V.S. ; Zhou, Yuanyuan
Author_Institution :
Dept. of Comput. Sci., Illinois Univ., Champaign, IL
Abstract :
As devices continue to scale, future shipped hardware will likely fail due to in-the-field hardware faults. As traditional redundancy-based hardware reliability solutions that tackle these faults will be too expensive to be broadly deployable, recent research has focused on low-overhead reliability solutions. One approach is to employ low-overhead ("always-on") detection techniques that catch high-level symptoms and pay a higher overhead for (rarely invoked) diagnosis. This paper presents trace-based fault diagnosis, a diagnosis strategy that identifies permanent faults in microarchitectural units by analyzing the faulty core's instruction trace. Once a fault is detected, the faulty core is rolled back and re-executes from a previous checkpoint, generating a faulty instruction trace and recording the microarchitecture-level resource usage. A diagnosis process on another fault-free core then generates a fault-free trace, which it compares with the faulty trace to identify the faulty unit. Our results show that this approach successfully diagnoses 98% of the faults studied and is a highly robust and flexible way of diagnosing permanent faults.
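The trace comparison described in the abstract can be illustrated with a minimal sketch. It is not the paper's algorithm: the abstract does not say how mismatches are mapped to a faulty unit, so the tallying heuristic below is an assumption, and the names TraceEntry, diagnose, and the unit labels (alu0, alu1, reg3, ...) are hypothetical. Each faulty-trace entry records the value an instruction produced and the microarchitectural units it used; units that appear most often in mismatching instructions are ranked as the most likely fault sites.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class TraceEntry:
    """One retired instruction in the faulty core's trace (hypothetical layout)."""
    pc: int        # program counter of the instruction
    result: int    # value the instruction produced on the faulty core
    units: tuple   # names of the microarchitectural units it used


def diagnose(faulty_trace, fault_free_results):
    """Rank units by how many mismatching instructions used them.

    ASSUMPTION: a simple vote count stands in for the paper's diagnosis
    logic, which the abstract does not describe in detail.
    """
    suspects = Counter()
    for entry, expected in zip(faulty_trace, fault_free_results):
        if entry.result != expected:
            # This instruction diverged, so every unit it touched is suspect.
            suspects.update(entry.units)
    return suspects.most_common()


# Illustrative data only: two of three instructions diverge, both used alu1.
faulty_trace = [
    TraceEntry(0x100, 5, ("alu0", "reg3")),
    TraceEntry(0x104, 9, ("alu1", "reg4")),
    TraceEntry(0x108, 7, ("alu1", "reg5")),
]
fault_free_results = [5, 8, 6]

ranking = diagnose(faulty_trace, fault_free_results)
print(ranking[0])
```

In this made-up example the top-ranked unit is alu1, because it is the only unit shared by both diverging instructions. When the two traces agree everywhere, the ranking is empty.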
Keywords :
computer architecture; fault diagnosis; fault tolerance; instruction sets; logic design; logic testing; microprocessor chips; checkpointing; instruction trace-based microarchitecture-level fault diagnosis; microarchitecture-level resource usage; permanent hardware fault; processor-level redundancy-based hardware reliability solution; Circuit faults; Computer science; Fault detection; Fault diagnosis; Hardware; Microarchitecture; Monitoring; Moore's Law; Pervasive computing; Robustness;
Conference_Title :
Dependable Systems and Networks With FTCS and DCC, 2008. DSN 2008. IEEE International Conference on
Conference_Location :
Anchorage, AK
Print_ISBN :
978-1-4244-2397-2
Electronic_ISBN :
978-1-4244-2398-9
DOI :
10.1109/DSN.2008.4630067