DocumentCode :
1393580
Title :
Maximizing mean-time to failure in k-resilient systems with repair
Author :
Fridman, José ; Rangarajan, Sampath
Author_Institution :
Dept. of Electr. & Comput. Eng., Northeastern Univ., Boston, MA, USA
Volume :
46
Issue :
2
fYear :
1997
fDate :
2/1/1997 12:00:00 AM
Firstpage :
229
Lastpage :
234
Abstract :
A k-resilient system with N components can tolerate up to k component failures and still function correctly. We consider k-resilient systems in which the number of tolerable component failures is a constant fraction of the total number of components, that is, k=N/c, where c is a constant such that 2⩽c<∞. Under a Markovian assumption of constant failure and repair rates, we compute the system size Nmax at which the mean-time to failure (MTTF) of such a system is maximized. Our results indicate that Nmax can be expressed in terms of the constant c and the parameter ρ as Nmax=K(c,ρ)/ρ, where ρ=λ/μ and K(c,ρ) is a function of c and ρ. In addition, we find that the variation of Nmax over the whole range of c is remarkably small; as a result, even if the resilience k of a system varies widely as a function of N, the system size at which the MTTF is maximized lies between 0.36/ρ and 0.5/ρ. We validate our results through event-driven simulation and also examine the behavior of systems with Weibull-distributed failure times.
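To illustrate the kind of computation the abstract describes, the following is a minimal sketch and not the authors' method. It assumes a birth-death Markov chain in which each working component fails at rate λ, a single repair facility restores one failed component at a time at rate μ, and the system fails when k+1 components are down. The abstract does not specify the repair model, so that choice, the restriction to integer c, and the function names mttf and best_size are all assumptions made for illustration.

```python
def mttf(n, k, lam, mu):
    """Expected time for the assumed birth-death chain to go from
    0 failed components to k + 1 failed components (system failure).

    Assumed model (not taken from the paper): with i components failed,
    the failure rate is (n - i) * lam and, for i > 0, the repair rate is mu.
    Requires k < n so that every failure rate is positive.
    """
    total = 0.0
    step = 0.0  # expected time to move from i failed to i + 1 failed
    for i in range(k + 1):
        fail_rate = (n - i) * lam
        repair_rate = mu if i > 0 else 0.0
        # First-passage recurrence: T_i = (1 + d_i * T_{i-1}) / b_i
        step = (1.0 + repair_rate * step) / fail_rate
        total += step
    return total


def best_size(c, lam, mu, n_limit):
    """Return the size n (a multiple of the integer c, up to n_limit)
    that maximizes mttf(n, n // c, lam, mu) under the assumed model."""
    sizes = range(c, n_limit + 1, c)
    return max(sizes, key=lambda n: mttf(n, n // c, lam, mu))


if __name__ == "__main__":
    lam, mu = 0.001, 1.0  # hypothetical rates, so rho = lam / mu = 0.001
    for c in (2, 4, 8):
        n_best = best_size(c, lam, mu, 2000)
        print(c, n_best, n_best * lam / mu)
```

The last printed column is Nmax·ρ for this assumed model, which can be compared with the 0.36 to 0.5 range quoted in the abstract. Only multiples of c are searched so that k=N/c is an integer.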
Keywords :
Markov processes; discrete event simulation; fault tolerant computing; Markovian assumption; Weibull distributed failure times; component failures; event-driven simulation; k-resilient systems with repair; mean-time to failure; system size; Discrete event simulation; Protocols; Resilience; Software performance; Software systems; Voting; Weibull distribution;
fLanguage :
English
Journal_Title :
Computers, IEEE Transactions on
Publisher :
IEEE
ISSN :
0018-9340
Type :
jour
DOI :
10.1109/12.565606
Filename :
565606