DocumentCode :
2177727
Title :
Optimal design configurations of fault-tolerant systems
Author :
Amari, S.V.
Author_Institution :
Parametric Technol. Corp., Greensburg, PA, USA
fYear :
2013
fDate :
28-31 Jan. 2013
Firstpage :
1
Lastpage :
6
Abstract :
Fault tolerance is an essential architectural attribute for achieving high reliability in many critical applications of digital systems. Automatic recovery and reconfiguration mechanisms play a crucial role in implementing fault tolerance because an uncovered fault may lead to a system or subsystem failure even when adequate redundancy exists. An excessive level of redundancy may even reduce the system reliability in addition to consuming system resources. Therefore, an accurate reliability analysis must account for not only the system structure but also the system fault and error handling behavior. The models that capture the fault and error handling behavior are called coverage models. The appropriate coverage modeling approach depends on the type of fault-tolerant techniques used. This paper describes and demonstrates a solution methodology that determines optimal design configurations that maximize the reliability of fault-tolerant systems subject to imperfect fault coverage and resource constraints. It is assumed that the system consists of several subsystems in series where each subsystem contains multiple redundant components. The problem formulation considers the generic type of fault-tolerant mechanisms and associated coverage models for each subsystem. The objective of the optimal design is to select the design configuration, type of components, and fault-tolerant mechanism for each subsystem from the applicable/available choices. Optimal solutions are determined based on an equivalent problem formulation and integer programming. The methodology presented here is flexible and can accurately model a wide range of fault-tolerant systems used in safety-critical applications. The methodology is successfully demonstrated on a large problem with 14 subsystems and 4 component choices for each subsystem.
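The abstract's claim that excessive redundancy can reduce reliability can be illustrated with a small numeric sketch. The model below is an assumption made for illustration and is not taken from the paper: a 1-out-of-n parallel subsystem of identical, independent components, each with reliability p, where every component fault is recovered from with probability c (a per-fault coverage factor) and a single uncovered fault fails the subsystem. The function names `parallel_reliability` and `best_redundancy` are hypothetical.

```python
from math import comb


def parallel_reliability(n: int, p: float, c: float) -> float:
    """Reliability of a 1-out-of-n parallel subsystem (assumed model).

    p: reliability of each component (independent, identical).
    c: probability that a component fault is covered (handled correctly).
    The subsystem survives if at least one component works and every
    fault that did occur was covered.
    """
    q = 1.0 - p
    # i failed components (each covered with probability c), n - i working.
    return sum(comb(n, i) * p ** (n - i) * (c * q) ** i for i in range(n))


def best_redundancy(p: float, c: float, n_max: int = 10) -> int:
    """Redundancy level in 1..n_max that maximizes subsystem reliability."""
    return max(range(1, n_max + 1),
               key=lambda n: parallel_reliability(n, p, c))


if __name__ == "__main__":
    for n in range(1, 6):
        print(n, round(parallel_reliability(n, p=0.9, c=0.95), 6))
    print("best n:", best_redundancy(p=0.9, c=0.95))
```

Under these assumed values (p = 0.9, c = 0.95), reliability rises up to n = 3 and then declines, because each added component also adds another chance of an uncovered fault. With perfect coverage (c = 1) the same function increases monotonically with n, which is why a coverage model is needed before choosing a redundancy level.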
Keywords :
design; fault tolerance; reliability; coverage modeling approach; design configuration; error handling behavior; fault tolerant system; reconfiguration mechanism; recovery mechanism; reliability analysis; subsystem failure; Fault tolerant systems; Linear programming; Mathematical model; Redundancy; Reliability engineering; coverage factor; fault-tolerant systems; redundancy optimization; resource constraints; system reliability;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Reliability and Maintainability Symposium (RAMS), 2013 Proceedings - Annual
Conference_Location :
Orlando, FL
ISSN :
0149-144X
Print_ISBN :
978-1-4673-4709-9
Type :
conf
DOI :
10.1109/RAMS.2013.6517667
Filename :
6517667