Title :
LiVe: Timely error detection in light-lockstep safety critical systems
Author :
Hernandez, C.; Abella, Jaume
Author_Institution :
Barcelona Supercomputing Center (BSC-CNS), Barcelona, Spain
Abstract :
Safety-critical systems rely on features such as lockstep execution for error detection, and on reset and re-execution for error correction. In particular, light lockstep is an attractive choice since it does not require redesigning cores but only comparing off-core activities (i.e., the data and addresses sent). While this approach suffices to guarantee functional correctness of the system, as needed for certification against safety standards (e.g., ISO 26262), it fails to provide any timing guarantee, as the time elapsed from the occurrence of an error until lockstep detects it can be inordinately large. In this paper, (i) we analyse the timing behaviour of errors in light lockstep systems, showing that a significant fraction of errors may remain undetected for long periods. Then, (ii) we put this problem in the context of certification against safety standards. Finally, (iii) we propose LiVe (Lightly Verbose), an approach to guarantee timely detection of errors at low cost in the context of light lockstep systems.
Keywords :
certification; embedded systems; error correction codes; error detection codes; fault tolerant computing; safety-critical software; error correction; light lockstep safety-critical system; safety standards; timely error detection; timing behaviour; Automotive engineering; Circuit faults; Hardware; Program processors; Registers; Safety; Timing; Automotive; Error detection; Lockstep; Real-time
Conference_Title :
Design Automation Conference (DAC), 2014 51st ACM/EDAC/IEEE
Conference_Location :
San Francisco, CA
DOI :
10.1145/2593069.2593155