Title :
An efficient real time fault detection and tolerance framework validated on the Intel SCC processor
Author :
Rai, Dipendra ; Huang, Pengcheng ; Stoimenov, Nikolay ; Thiele, Lothar
Author_Institution :
Comput. Eng. & Networks Lab., ETH Zurich, Zurich, Switzerland
Abstract :
We present a new framework that efficiently detects and tolerates timing faults in real-time systems. Timing faults are observed when the inputs and/or outputs of a given system fail to meet their desired timing properties, such as I/O rates. Most current approaches rely either on heartbeat monitoring, which is too restrictive, or on statistical or inexact methods, which are not suitable for embedded real-time systems. Current approaches based on the abstract real-time model of the given application are resource intensive and may not be suitable for embedded systems. Our framework uses active replication and builds on existing timing models for real-time applications to develop fault detection and tolerance strategies. The approach does not require any timekeeping at runtime and is efficient in terms of the computational resources used. Experiments using three realistic applications on the Intel Baremetal SCC demonstrate the efficiency of our framework in both memory and computational resources used.
Keywords :
embedded systems; fault diagnosis; fault tolerant computing; input-output programs; microprocessor chips; I-O rates; Intel Baremetal SCC processor; active replication; computational resources; embedded real time systems; real time fault detection framework; real time fault tolerance framework; timing faults; Decoding; Fault detection; Fault tolerance; Fault tolerant systems; Monitoring; Real-time systems; Timing
Conference_Title :
Design Automation Conference (DAC), 2014 51st ACM/EDAC/IEEE
Conference_Location :
San Francisco, CA
DOI :
10.1145/2593069.2593085