DocumentCode :
2364647
Title :
Determination of an optimal retry time in multiple-module computing systems
Author :
Hou, Chao-Ju ; Shin, Kang G.
Author_Institution :
Dept. of Electr. Eng. & Comput. Sci., Michigan Univ., Ann Arbor, MI, USA
fYear :
1993
fDate :
25-28 Apr 1993
Firstpage :
294
Lastpage :
301
Abstract :
The optimal amount of time to spend retrying an instruction after an error is detected in a computing system is usually determined under the assumption that the system consists of a single module, within which all fault activities are confined until some module-replacement action is taken. The authors relax this single-module assumption and model the fault activities in a multiple-module system as a Markov process. The randomization approach is applied to decompose the Markov process into a discrete-time Markov chain subordinated to a Poisson process. Using this decomposition, several measures of interest can be derived, such as the conditional probability of a successful retry given a retry period and given that a non-permanent fault has occurred, and the mean time until all modules in the system enter a fault-free state. Together with the parameters characterizing fault activities and the costs of recovery techniques, these measures are used to determine whether retry should be used as a first-step recovery means on detection of an error, and the best retry period subject to a specified probability of successful retry.
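The randomization (uniformization) step named in the abstract can be illustrated with a minimal sketch. It is not the authors' model: the two-state generator, the rate `mu`, and the function name `uniformized_transient` are hypothetical and chosen only to show how a continuous-time Markov process is evaluated as a discrete-time chain whose step count follows a Poisson distribution.

```python
import math

def uniformized_transient(Q, p0, t, tol=1e-12, max_terms=10_000):
    """Approximate the state distribution at time t of a continuous-time
    Markov chain with generator matrix Q, starting from distribution p0."""
    n = len(Q)
    # Uniformization rate: the largest exit rate of any state.
    rate = max(-Q[i][i] for i in range(n))
    if rate == 0.0:
        return list(p0)  # no transitions are possible
    # One-step matrix of the subordinated discrete-time chain: P = I + Q / rate.
    P = [[(1.0 if i == j else 0.0) + Q[i][j] / rate for j in range(n)]
         for i in range(n)]
    weight = math.exp(-rate * t)  # Poisson probability of zero jumps by time t
    vec = list(p0)                # distribution after k steps of the chain
    result = [weight * v for v in vec]
    total = weight                # accumulated Poisson mass
    k = 0
    while 1.0 - total > tol and k < max_terms:
        k += 1
        vec = [sum(vec[i] * P[i][j] for i in range(n)) for j in range(n)]
        weight *= rate * t / k    # Poisson probability of exactly k jumps
        total += weight
        result = [r + weight * v for r, v in zip(result, vec)]
    return result

# Hypothetical example: state 0 = non-permanent fault active,
# state 1 = fault-free (absorbing); the fault disappears at rate mu.
mu = 2.0
Q = [[-mu, mu],
     [0.0, 0.0]]
p = uniformized_transient(Q, [1.0, 0.0], t=0.5)
```

In this toy model `p[1]` plays the role of the probability that a retry of length `t` succeeds, and it agrees with the closed form `1 - exp(-mu * t)`. The multiple-module model in the paper has a larger state space, which this sketch does not attempt to reproduce.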
Keywords :
Markov processes; fault tolerant computing; optimisation; performance evaluation; probability; reliability; stochastic processes; system recovery; Markov process; Poisson process; conditional probability; discrete-time Markov chain; error detection; fault activities; fault-free state; module-replacement action; multiple-module computing systems; optimal retry time; randomization approach; single-module assumption; system recovery; Chaos; Computer aided instruction; Computer errors; Computer science; Electrical fault detection; Laboratories; Markov processes; Position measurement; Real time systems; Time measurement;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Uncertainty Modeling and Analysis, 1993. Proceedings., Second International Symposium on
Conference_Location :
College Park, MD
Print_ISBN :
0-8186-3850-8
Type :
conf
DOI :
10.1109/ISUMA.1993.366753
Filename :
366753