• DocumentCode
    2366684
  • Title
    Primary-shadow consistency issues in the DRB scheme and the recovery time bound
  • Author
    Kim, K.H.; Bacellar, Luiz; Subbaraman, Chittur
  • Author_Institution
    Dept. of Electr. & Comput. Eng., California Univ., Irvine, CA, USA
  • fYear
    1996
  • fDate
    30 Oct-2 Nov 1996
  • Firstpage
    319
  • Lastpage
    329
  • Abstract
    The distributed recovery block (DRB) scheme is an approach for realizing both hardware and software fault tolerance in real-time distributed and parallel computer systems. We point out that in order for the DRB scheme to yield a high fault coverage and a low recovery time bound, some important consistency requirements must be satisfied by the replicated application tasks in a DRB computing station. Newly identified approaches for meeting the consistency requirements, which involve, among other things, integration of network surveillance and reconfiguration (NSR) techniques with the DRB scheme, are presented. The paper then presents an analysis of the recovery time bound of the DRB scheme. The analysis is based on a modular structured concrete implementation model of the DRB scheme for local area network (LAN) based distributed computer systems, which is called the DRB/T LAN scheme and incorporates an NSR scheme and the newly identified consistency-ensuring mechanisms. Finally, we consider approaches for applying the DRB scheme to new types of application computation segments that were not considered before and then discuss approaches for meeting the consistency requirements in such DRB stations. These approaches broaden the application range of the DRB scheme significantly.
  • Keywords
    computer network reliability; fault tolerant computing; local area networks; parallel machines; parallel programming; real-time systems; reliability; software fault tolerance; system recovery; DRB scheme; DRB/T LAN scheme; application computation segments; consistency ensuring mechanisms; consistency requirements; distributed recovery block scheme; fault coverage; local area network based distributed computer systems; modular structured concrete implementation model; network surveillance and reconfiguration; parallel computer systems; primary shadow consistency issues; real time distributed systems; recovery time bound; replicated application tasks; software fault tolerance; Application software; Computer networks; Concrete; Concurrent computing; Distributed computing; Fault tolerant systems; Hardware; Local area networks; Real time systems; Surveillance;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Software Reliability Engineering, 1996. Proceedings., Seventh International Symposium on
  • Conference_Location
    White Plains, NY
  • Print_ISBN
    0-8186-7707-4
  • Type
    conf
  • DOI
    10.1109/ISSRE.1996.558888
  • Filename
    558888