• DocumentCode
    1465729
  • Title
    Implementing fail-silent nodes for distributed systems
  • Author
    Brasileiro, Francisco V. ; Ezhilchelvan, Paul Devadoss ; Shrivastava, Santosh K. ; Speirs, Neil A. ; Tao, S.
  • Author_Institution
    Dept. de Sistemas e Computacao, Univ. Federal da Paraiba, Joao Pessoa, Brazil
  • Volume
    45
  • Issue
    11
  • fYear
    1996
  • fDate
    11/1/1996
  • Firstpage
    1226
  • Lastpage
    1238
  • Abstract
    A fail-silent node is a self-checking node that either functions correctly or stops functioning after an internal failure is detected. Such a node can be constructed from a number of conventional processors. In a software-implemented fail-silent node, the nonfaulty processors of the node need to execute message order and comparison protocols to “keep in step” and check each other, respectively. In this paper, the design and implementation of efficient protocols for a two-processor fail-silent node are described in detail. The performance figures obtained indicate that, in a wide class of applications requiring a high degree of fault tolerance, software-implemented fail-silent nodes constructed simply from standard “off-the-shelf” components are an attractive alternative to their hardware-implemented counterparts, which do require special-purpose hardware components such as fault-tolerant clocks, comparator, and bus interface circuits.
  • Keywords
    distributed processing; fault tolerant computing; multiprocessing systems; protocols; reliability; software fault tolerance; bus interface circuits; comparator; distributed systems; fail-silent nodes; fault tolerance; fault-tolerant clocks; hardware-implemented counterparts; internal failure; performance figures; self-checking node; software-implemented fail-silent node; software-implemented fail-silent nodes; special-purpose hardware components; two processor fail-silent node; Circuits; Clocks; Hardware; Logic; Master-slave; Protocols; Software design; Synchronization; Voting
  • fLanguage
    English
  • Journal_Title
    Computers, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    0018-9340
  • Type
    jour
  • DOI
    10.1109/12.544479
  • Filename
    544479