• DocumentCode
    1013957
  • Title
    The SURE approach to reliability analysis
  • Author
    Butler, Ricky W.
  • Author_Institution
    NASA Langley Res. Center, Hampton, VA, USA
  • Volume
    41
  • Issue
    2
  • fYear
    1992
  • fDate
    6/1/1992 12:00:00 AM
  • Firstpage
    210
  • Lastpage
    218
  • Abstract
    The SURE computer program, a reliability-analysis tool for ultrareliable computer-system architectures, provides an efficient means for computing reasonably accurate upper and lower bounds for the death state probabilities of a large class of semi-Markov models. Once a semi-Markov model is described using a simple input language, SURE automatically computes the upper and lower bounds on the probability of system failure. A parameter of the model can be specified as a variable over a range of values, thus directing SURE to perform a sensitivity analysis automatically. The program provides a rapid computational capability for semi-Markov models useful for describing the fault-handling behavior of fault-tolerant computer systems. The only modeling restriction imposed by the program is that the nonexponential recovery transitions must be fast in comparison to the mission time. The SURE reliability-analysis method uses a fast bounding theorem based on means and variances and yields upper and lower bounds on the probability of system failure. Techniques have been developed to enable SURE to solve models with loops and calculate the operational-state probabilities. The computation is extremely fast, and large state-spaces can be directly solved; a pruning technique enables SURE to process extremely large models.
  • Keywords
    Markov processes; fault tolerant computing; reliability theory; SURE computer program; death state probabilities; fast bounding theorem; fault-tolerant computer systems; lower bounds; mission time; nonexponential recovery transitions; operational-state probabilities; pruning technique; reliability analysis; semi-Markov model; sensitivity analysis; system failure probability; ultrareliable computer-system architectures; upper bounds; Differential equations; Digital systems; Fault tolerant systems; NASA; Operating systems; Probability; Reliability theory; Sensitivity analysis; System recovery; Voice mail
  • fLanguage
    English
  • Journal_Title
    Reliability, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    0018-9529
  • Type
    jour
  • DOI
    10.1109/24.257783
  • Filename
    257783