• DocumentCode
    1146752
  • Title
    Derivation and Calibration of a Transient Error Reliability Model
  • Author
    Castillo, Xavier ; McConnel, Stephen R. ; Siewiorek, Daniel P.
  • Author_Institution
    Asociacion I.T.P.
  • Issue
    7
  • fYear
    1982
  • fDate
    7/1/1982 12:00:00 AM
  • Firstpage
    658
  • Lastpage
    671
  • Abstract
    In this paper a new modeling methodology to characterize failure processes in digital computers due to hardware transients is presented. The basic assumption made is that system sensitivity to hardware transient errors is a function of critical resources usage. The failure rate of a given resource is approximated by a deterministic function of time, depending on the average workload of that resource, plus a Gaussian process. The probability density function of the time to failure obtained under this assumption has a decreasing hazard function, explaining why decreasing hazard function densities such as the Weibull fit experimental data so well. Data on transient errors obtained from several systems are analyzed. Statistical tests confirm the good fit between decreasing hazard distributions and actual data. Finally, models of common fault-tolerant redundant structures are developed using decreasing hazard function distributions. The analysis indicates significant differences between reliability predictions based on the exponential distribution and those based on decreasing hazard function distributions. Reliability differences of 0.2 and factors greater than 2 in Mission Time Improvement are seen in model results. System designers should be aware of these differences.
  • Keywords
    Decreasing hazard function distributions; Weibull distribution; redundant systems; reliability modeling; reliability prediction; system simulation; transient faults; Calibration; Computer errors; Exponential distribution; Fault tolerance; Gaussian processes; Hardware; Hazards; Probability density function; Testing; Transient analysis
  • fLanguage
    English
  • Journal_Title
    Computers, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    0018-9340
  • Type
    jour
  • DOI
    10.1109/TC.1982.1676063
  • Filename
    1676063