• DocumentCode
    3351932
  • Title
    Markov models of fault-tolerant memory systems under SEU
  • Author
    Schiano, Luca ; Ottavi, Marco ; Lombardi, Fabrizio
  • Author_Institution
    Dept. of Electr. & Comput. Eng., Northeastern Univ., Boston, MA, USA
  • fYear
    2004
  • fDate
    9-10 Aug. 2004
  • Firstpage
    38
  • Lastpage
    43
  • Abstract
    A single event upset (SEU) can affect the correct operation of digital systems, such as memories and processors. This paper proposes Markov-based models for analyzing the reliability and availability of different fault-tolerant memory arrangements under the operational scenario of an SEU. These arrangements exploit redundancy (either duplex or triplex replication) for dynamic fault-tolerant operation, both as provided by arbitration (for error detection and output selection) and in the presence of dedicated circuitry implementing different correction/detection codes for bit-flip errors. The primary objective is to preserve either the correctness or the fail-safe nature of the data output of the memory system over long mission times. It is shown that a duplex memory system encoded with error control codes has a higher reliability than the triplex arrangement. Moreover, the use of a code for single error correction and double error detection (SEC-DED) improves both availability and reliability compared to an error correction code with the same error detection capabilities.
  • Keywords
    Markov processes; error correction codes; error detection codes; fault tolerant computing; integrated circuit modelling; integrated circuit reliability; integrated memory circuits; redundancy; Markov model; bit-flips; digital systems; double error detection; duplex memory system; duplex replication; dynamic fault-tolerant operation; error control codes; error correction codes; error detection codes; fail-safe nature; fault-tolerant memory system; memories; processors; single error correction; single event upset; triplex replication; Availability; Circuits; Digital systems; Electrical fault detection; Error correction codes; Fault detection; Fault tolerance; Fault tolerant systems; Redundancy; Single event upset;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Memory Technology, Design and Testing, 2004. Records of the 2004 International Workshop on
  • ISSN
    1087-4852
  • Print_ISBN
    0-7695-2193-2
  • Type
    conf
  • DOI
    10.1109/MTDT.2004.1327982
  • Filename
    1327982