DocumentCode :
3189882
Title :
Transparent software replication and hardware monitoring leveraging modern System-on-Chip features
Author :
Paulitsch, M. ; Nowotsch, Jan ; Munch, Dominik ; Girbinger, Ludwig
Author_Institution :
EADS Innovation Works, Munich, Germany
fYear :
2013
fDate :
19-21 Aug. 2013
Firstpage :
157
Lastpage :
164
Abstract :
Modern Commercial-Off-The-Shelf (COTS) System-on-Chip (SoC) devices like multi-core computers have a variety of built-in features like Direct Memory Access (DMA) engines or sophisticated debug units. Using COTS devices in safety-critical environments like avionics requires replication, which can be based on diverse hardware to mitigate faults such as design errors, or on similar hardware to compensate for permanent and transient hardware faults, e.g., due to single-event effects. This paper presents a novel approach to building fault-tolerant board architectures using chip-built-in features like debug units, and to implementing replication of application software components without the need to adapt the application software. The advantages of the presented approach are the ability (1) to build fault-tolerant architectures relatively cheaply out of COTS components and (2) to separate the functional program from fault-tolerance-related code and, hence, also to include legacy code transparently. A demonstrator using two modern multicore processors connected by PCIe and debug units proves the feasibility of the described conceptual approach. Additional performance measurements quantify the benefit over commonly deployed software-based approaches.
Keywords :
fault tolerant computing; multiprocessing systems; program debugging; software engineering; system-on-chip; COTS SoC device; DMA engines; PCIe; application software; avionics; commercial-off-the-shelf; debug units; direct memory access; fault mitigation; fault-tolerance-related code; fault-tolerant board architectures; functional program; hardware monitoring; legacy code; multicore processors; peripheral component interconnect unit; permanent hardware fault; safety-critical environments; system-on-chip features; transient hardware fault; transparent software replication; Computer architecture; Fault tolerance; Hardware; Monitoring; Program processors; System-on-chip;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Embedded and Real-Time Computing Systems and Applications (RTCSA), 2013 IEEE 19th International Conference on
Conference_Location :
Taipei
ISSN :
1533-2306
Type :
conf
DOI :
10.1109/RTCSA.2013.6732215
Filename :
6732215