• DocumentCode
    1686795
  • Title
    Inferring coverage probabilities by optimum 3-stage sampling
  • Author
    Constantinescu, Cristian
  • Author_Institution
    Duke Univ., Durham, NC, USA
  • fYear
    1996
  • Firstpage
    61
  • Lastpage
    65
  • Abstract
    Reliability assessment is an important step in the development of fault-tolerant computing systems. Availability, MTTF, and, in general, any reliability measure are determined by the system's ability to handle faults and errors and by the rate of occurrence of these events. A special parameter, the coverage probability, provides information about the effectiveness of the fault tolerance mechanisms embedded in the system. In practice, physical or simulated fault injection experiments are conducted to evaluate the coverage. Unfortunately, the extremely large number of events that can perturb the operation of a computing system makes exhaustive testing intractable. As a consequence, statistical inference has been employed to derive meaningful results after performing a relatively small number of fault injection experiments. This paper presents a new method for inferring the coverage probability by means of optimum 3-stage sampling. A three-dimensional space of events is considered, represented by the cross product of system inputs, times of injection, and fault locations. The fault injection consists of a pilot experiment followed by the main injection experiment. The sample size of the main experiment is chosen to minimize the cost of the fault injection for a fixed value of the variance. This approach is used for estimating the coverage probability of a hypothetical fault-tolerant system. Based on our experiments, we conclude that the optimum 3-stage sampling method is especially useful when a low variance of the coverage probability is required.
  • Keywords
    fault diagnosis; fault location; fault tolerant computing; probability; reliability; MTTF; availability; cost minimisation; coverage probability; fault tolerance mechanisms; fault-tolerant computing systems; optimum 3-stage sampling; reliability assessment; simulated fault injection experiments; statistical inference; Cost function; Fault location; Fault tolerant systems; Fault trees; Humans; Petri nets; Probability; Sampling methods; System testing; Upper bound
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Reliability and Maintainability Symposium, 1996 Proceedings. International Symposium on Product Quality and Integrity., Annual
  • Conference_Location
    Las Vegas, NV
  • ISSN
    0149-144X
  • Print_ISBN
    0-7803-3112-5
  • Type
    conf
  • DOI
    10.1109/RAMS.1996.500643
  • Filename
    500643
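
The abstract describes choosing the main-experiment sample size to minimize cost for a fixed variance. The record does not give the paper's formulas, so the following is only a minimal sketch of the generic textbook allocation for three-stage sampling with equal-sized units, not the author's method. It assumes the variance of the estimate is var1/n + var2/(n*m) + var3/(n*m*k) and the cost is c1*n + c2*n*m + c3*n*m*k, with finite-population corrections ignored. The mapping of stages to inputs, injection times, and fault locations, the function names, and all numeric values are hypothetical; in a setting like the one described, the variance components would come from a pilot experiment.

```python
import math
from dataclasses import dataclass


@dataclass
class Allocation:
    n: float     # first-stage units (assumed here: system inputs)
    m: float     # second-stage units per first-stage unit (assumed: injection times)
    k: float     # third-stage units per second-stage unit (assumed: fault locations)
    cost: float  # total cost of the main experiment under the assumed cost model


def variance(n, m, k, var1, var2, var3):
    """Variance of the estimate under the assumed three-stage model."""
    return var1 / n + var2 / (n * m) + var3 / (n * m * k)


def optimum_three_stage(var1, var2, var3, c1, c2, c3, target_variance):
    """Minimize cost subject to variance == target_variance.

    Sample sizes are treated as continuous; round up in practice.
    """
    # Subsampling rates follow from minimizing cost * variance.
    m = math.sqrt((var2 / var1) * (c1 / c2))
    k = math.sqrt((var3 / var2) * (c2 / c3))
    # The first-stage size is then fixed by the variance target.
    n = (var1 + var2 / m + var3 / (m * k)) / target_variance
    cost = n * (c1 + c2 * m + c3 * m * k)
    return Allocation(n, m, k, cost)


if __name__ == "__main__":
    # Hypothetical pilot estimates and per-unit costs, for illustration only.
    alloc = optimum_three_stage(0.01, 0.04, 0.16, 16.0, 4.0, 1.0, 0.001)
    print(alloc)
    print(variance(alloc.n, alloc.m, alloc.k, 0.01, 0.04, 0.16))
```

Under this model the subsampling rates m and k depend only on the ratios of variance components and costs, while the variance target affects only n. Halving the target variance therefore doubles n and the total cost.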