• DocumentCode
    1366614
  • Title

    The effect of program behavior on fault observability

  • Author

    Bowen, Nicholas S.; Pradhan, Dhiraj K.

  • Author_Institution
    IBM Thomas J. Watson Res. Center, Yorktown Heights, NY, USA
  • Volume
    45
  • Issue
    8
  • fYear
    1996
  • fDate
    8/1/1996 12:00:00 AM
  • Firstpage
    868
  • Lastpage
    880
  • Abstract
    Fault observability based on the behavior of memory references is studied. Traditional studies view memory as one monolithic entity that must completely work to be considered reliable. The usage patterns of a particular program's memory are emphasized here. This paper develops a new model for the successful execution of a program taking into account the usage of the data by extending a cache memory performance model. Three variations, based on well-known allocation schemes, are presented (i.e., whether the program's storage is preallocated, dynamically allocated, or constrained in allocation). This is contrasted to traditional memory reliability calculations to show that the actual mean time to failure may be more optimistic when program behavior is considered. It also develops expressions for the probability of unobserved faults. With several studies reporting correlations between increased workloads and increased failure rates, a new theory is proposed here that provides an explanation for this behavior. The model studies several program traces demonstrating that increased workloads could cause an increase of the observed failure rates in the range of 32% to 53%.
  • Keywords
    cache storage; fault tolerant computing; storage allocation; allocation schemes; cache memory performance model; fault observability; memory references; program behavior; program traces; Computer errors; Computer science; Error correction codes; Failure analysis; Fault tolerance; Observability; Predictive models; Random access memory; Read-write memory; Reliability
  • fLanguage
    English
  • Journal_Title
    Computers, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    0018-9340
  • Type

    jour

  • DOI
    10.1109/12.536230
  • Filename
    536230