• DocumentCode
    1666556
  • Title
    Performance analysis of a fault-tolerant distributed-shared memory protocol on the SOME-bus multiprocessor architecture
  • Author
    Hecht, Diana ; Katsinis, Constantine
  • Author_Institution
    Dept. of Electr. & Comput. Eng., Drexel Univ., Philadelphia, PA, USA
  • fYear
    2003
  • Abstract
    Interconnection networks allowing multiple simultaneous broadcasts are becoming feasible, mostly due to advances in fiber optics and VLSI technology. Distributed-shared-memory (DSM) implementations on such networks promise high performance even for applications with small granularity. This paper summarizes the architecture of one such implementation, the simultaneous optical multiprocessor exchange bus (SOME-bus), and examines the performance of an augmented DSM protocol that provides fault tolerance by exploiting the natural DSM replication of data to maintain a recovery memory in each processing node. Theoretical and simulation results show that the additional data replication necessary to create fault-tolerant DSM causes no reduction in system performance during normal operation and eliminates most of the overhead at checkpoint creation. Data blocks that are duplicated to maintain the recovery memory may be utilized by the regular DSM protocol, reducing network traffic and significantly increasing processor utilization.
  • Keywords
    digital simulation; distributed shared memory systems; fault tolerant computing; multiprocessing systems; optical computing; performance evaluation; SOME-bus multiprocessor architecture; data replication; distributed-shared-memory implementations; fault-tolerant distributed-shared memory protocol; interconnection networks; multiple simultaneous broadcasts; performance analysis; processor utilization; simulation results; simultaneous optical multiprocessor exchange bus; Broadcast technology; Broadcasting; Fault tolerance; Fault tolerant systems; Multiprocessor interconnection networks; Optical interconnections; Performance analysis; Protocols; System performance; Very large scale integration;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Title
    Parallel and Distributed Processing Symposium, 2003. Proceedings. International
  • ISSN
    1530-2075
  • Print_ISBN
    0-7695-1926-1
  • Type
    conf
  • DOI
    10.1109/IPDPS.2003.1213389
  • Filename
    1213389