• DocumentCode
    969421
  • Title

    Fully distributed three-tier active software replication

  • Author

    Marchetti, Carlo ; Baldoni, Roberto ; Tucci-Piergiovanni, Sara ; Virgillito, Antonino

  • Author_Institution
    Dipt. di Informatica e Sistemistica, Università degli Studi di Roma "La Sapienza"
  • Volume
    17
  • Issue
    7
  • fYear
    2006
  • fDate
    7/1/2006 12:00:00 AM
  • Firstpage
    633
  • Lastpage
    645
  • Abstract
    Keeping the state of the replicas of a software service strongly consistent, when the service is deployed across a distributed system prone to crashes and with highly unstable message transfer delays (e.g., the Internet), is a real practical challenge. The solution to this problem is subject to the FLP impossibility result, and thus there is a need for "long enough" periods of synchrony with time bounds on process speeds and message transfer delays to ensure deterministic termination of any run of agreement protocols executed by replicas. This behavior can be abstracted by a partially synchronous computational model. In this setting, before reaching a period of synchrony, the underlying network can arbitrarily delay messages, and these delays can be perceived as false failures by some timeout-based failure detection mechanism, leading to unexpected service unavailability. This paper proposes a fully distributed solution for active software replication based on a three-tier software architecture well-suited to such a difficult setting. The formal correctness of the solution is proved by assuming the middle-tier runs in a partially synchronous distributed system. This architecture separates the ordering of the requests coming from clients, executed by the middle-tier, from their actual execution, done by replicas, i.e., the end-tier. In this way, clients can show up in any part of the distributed system and replica placement is simplified, since only the middle-tier has to be deployed on a well-behaving part of the distributed system that frequently respects synchrony bounds. This deployment permits rapid timeout tuning, thus reducing unexpected service unavailability.
  • Keywords
    fault diagnosis; fault tolerant computing; formal verification; message passing; software architecture; active software replication; message transfer delays; software service; synchronous computational model; synchronous distributed system; three-tier software architecture; timeout-based failure detection mechanism; Computational modeling; Computer architecture; Computer crashes; Delay effects; Distributed computing; Protocols; Software architecture; Software systems; Timing; Web and internet services; Dependable distributed systems; architectures for dependable services; replication protocols; software replication in wide-area networks
  • fLanguage
    English
  • Journal_Title
    Parallel and Distributed Systems, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    1045-9219
  • Type

    jour

  • DOI
    10.1109/TPDS.2006.89
  • Filename
    1642640