• DocumentCode
    1084338
  • Title
    Generating a fault-tolerant global clock using high-speed control signals for the MetaNet architecture
  • Author
    Ofek, Yoram
  • Author_Institution
    IBM Thomas J. Watson Res. Center, Yorktown Heights, NY, USA
  • Volume
    42
  • Issue
    5
  • fYear
    1994
  • fDate
    5/1/1994 12:00:00 AM
  • Firstpage
    2179
  • Lastpage
    2188
  • Abstract
    Describes a new technique, based on exchanging control signals between neighboring nodes, for constructing a stable and fault-tolerant global clock in a distributed system with an arbitrary topology. It is shown that it is possible to construct a global clock reference with a time step that is much smaller than the propagation delay over the network's links. The synchronization algorithm ensures that the global clock “tick” has a stable periodicity, and therefore, it is possible to tolerate failures of links and clocks that operate faster and/or slower than nominally specified, as well as hard failures. The approach taken is to generate a global clock from the ensemble of the local transmission clocks and not to directly synchronize these high-speed clocks. The steady-state algorithm, which generates the global clock, is executed in hardware by the network interface of each node. At the network interface, it is possible to measure accurately the propagation delay between neighboring nodes with a small error or uncertainty and thereby to achieve global synchronization that is proportional to these error measurements. It is shown that the local clock drift (or rate uncertainty) has only a secondary effect on the maximum global clock rate. The synchronization algorithm can tolerate any physical failure. It will continue to operate correctly on any connected segment of the network, i.e., it can tolerate any number of link and node failures, as long as the network remains connected.
  • Keywords
    clocks; fault tolerant computing; local area networks; reliability; synchronisation; telecommunications control; MetaNet architecture; distributed system; error measurements; fault-tolerant global clock; global clock reference; global synchronization; hard failures; high-speed control signals; link failures; local transmission clocks; network interface; node failures; propagation delay; rate uncertainty; stable periodicity; steady-state algorithm; synchronization algorithm; time step; topology; Clocks; Control systems; Fault tolerance; Fault tolerant systems; Hardware; Network interfaces; Network topology; Propagation delay; Steady-state; Synchronization;
  • fLanguage
    English
  • Journal_Title
    Communications, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    0090-6778
  • Type
    jour
  • DOI
    10.1109/26.285154
  • Filename
    285154