Title :
Tolerating transient communication faults with online traffic scheduling
Author :
Marques, Luis ; Vasconcelos, Verónica ; Pedreiras, Paulo ; Almeida, Luis
Author_Institution :
ISEC, IPC, Portugal
Abstract :
Building distributed embedded systems that remain fault-free for their entire lifetime is virtually impossible, so such systems must tolerate faults if continued correct behavior is required. This is the case for safety-critical systems, such as X-by-wire systems in the automotive domain. Transient communication faults in particular can be handled at various levels of the protocol stack with different techniques, e.g., temporal and spatial redundancy. In this paper we focus on temporal redundancy and address the limitations that static traffic definition imposes on typical time-triggered systems, which are commonly found in safety-critical applications. In these systems, using temporal redundancy to handle communication errors requires pre-allocating communication resources that are wasted in the absence of errors. We therefore propose an online traffic scheduling approach in which retransmissions are scheduled consistently with the remaining time-triggered traffic, using the unique flexibility provided by the FTT-CAN protocol (Flexible Time-Triggered communication on CAN). We address the integration of appropriate fault detectors in the FTT-CAN protocol to monitor bus activity and reschedule omitted messages. We show that this approach is more efficient than static allocation, since communication resources are allocated only when necessary. We also discuss alternative realizations and validate the approach with initial results from a prototype implementation.
Keywords :
controller area networks; distributed processing; embedded systems; fault tolerant computing; protocols; scheduling; FTT-CAN protocol; X-by-wire systems; distributed embedded systems; flexible time-triggered communication; online traffic scheduling; safety-critical systems; temporal redundancy; time-triggered systems; transient communication fault tolerance; Reliability;
Conference_Title :
Industrial Technology (ICIT), 2012 IEEE International Conference on
Conference_Location :
Athens
Print_ISBN :
978-1-4673-0340-8
DOI :
10.1109/ICIT.2012.6209970