Title of article :
Software fault detection and recovery in critical real-time systems: An approach based on loose coupling
Author/Authors :
Alho, Pekka and Mattila, Jouni
Issue Information :
Journal with serial number, year 2014
Pages :
6
From page :
2272
To page :
2277
Abstract :
Remote handling (RH) systems are used to inspect, modify, and maintain components in the ITER machine, and as such are an example of a mission-critical system. Failure in a critical system may cause damage, significant financial losses, and loss of experiment runtime, making dependability one of its most important properties. However, even if the software for RH control systems has been developed using best practices, the system might still fail due to undetected faults (bugs), hardware failures, etc. Critical systems therefore need the capability to tolerate faults and resume operation after they occur. Designing effective fault detection and recovery mechanisms is nevertheless challenging because of timeliness requirements, growth in scale, and complex interactions. In this paper we evaluate the effectiveness of a service-oriented architectural approach to fault tolerance in mission-critical real-time systems. We use a prototype implementation for service management with an experimental RH control system and an industrial manipulator. The fault tolerance relies on the high level of decoupling between services to recover from transient faults by restarting services. If the recovery process is not successful, the system can still be used provided the fault was not in a critical software module.
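The restart-based recovery described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' prototype: the `Service` type, the `recover` and `handle_faults` functions, the restart limit of three, and the mode names "nominal", "degraded", and "halted" are all assumptions made for illustration. The sketch only shows the general idea that loosely coupled services can be restarted independently, and that the system stays usable when an unrecoverable fault sits in a non-critical service.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass
class Service:
    """Hypothetical stand-in for one loosely coupled service."""
    name: str
    critical: bool
    start: Callable[[], bool]  # returns True when the service comes up healthy


def recover(service: Service, max_restarts: int = 3) -> bool:
    """Try to clear a transient fault by restarting the service."""
    for _ in range(max_restarts):
        if service.start():
            return True
    return False


def handle_faults(
    services: List[Service], faulty: List[str]
) -> Tuple[str, Dict[str, str]]:
    """Restart each faulty service and report the resulting system mode."""
    by_name = {s.name: s for s in services}
    status = {s.name: "running" for s in services}
    for name in faulty:
        status[name] = "running" if recover(by_name[name]) else "failed"

    # A failed critical service stops the system; a failed non-critical
    # service only degrades it (assumed policy, following the abstract).
    if any(s.critical and status[s.name] == "failed" for s in services):
        mode = "halted"
    elif "failed" in status.values():
        mode = "degraded"
    else:
        mode = "nominal"
    return mode, status
```

Because each service is restarted on its own, a transient fault in one service does not force a restart of the others. A real-time system would additionally need bounded restart times and a fault detector such as a heartbeat or watchdog, both of which are left out here.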
Keywords :
ITER , remote handling , Software , dependability , Fault tolerance , Real-time
Journal title :
Fusion Engineering and Design
Serial Year :
2014
Record number :
2362948