• DocumentCode
    564967
  • Title

    Preventing state divergence in duplex systems using causal memory

  • Author

    Chitsaz, Behzad ; Razzazi, Mohammadreza

  • Author_Institution
    Dept. of Comput. Eng. & Inf. Technol., Amirkabir Univ. of Technol., Tehran, Iran
  • fYear
    2012
  • fDate
    21-25 May 2012
  • Firstpage
    257
  • Lastpage
    261
  • Abstract
    Replicated execution of distributed programs provides a means of masking hardware or software failures in a distributed system. Application-level entities (processes, objects) are replicated to execute on distinct processors. Such replica entities communicate via message passing. Non-determinism within the replicas could cause messages to be processed in non-identical order, producing a divergence of state. The replicas could thereafter produce inconsistent responses to identical messages and hence appear to be faulty. The partial-order model of distributed computations based on the happened-before relation, as used in approaches such as primary-backup, has been criticized for allowing false causality between messages. False causality causes unnecessary blocking of processes and results in high time overhead for replicating entities. In this paper we use the concepts of causal memory and multi-version states to reduce the false causality between messages. We capture the read/write operations on the variables of each process to find the dependencies between messages, and save some old values of variables for use in cases where read operations may cause divergence in the states of replicas. Simulation results show that this approach has lower execution time than the primary-backup approach.
  • Keywords
    distributed programming; message passing; replicated databases; software architecture; software fault tolerance; storage management; system recovery; application level entities; causal memory; distributed computations; distributed programs; duplex systems; false causality reduction; hardware failure masking; message-passing; nonidentical order; primary-backup approach; read-write operations; replicated execution; software failure masking; state divergence prevention; Computational modeling; Data structures; Delay; Fault tolerance; Fault tolerant systems; History; Memory management; Causal Memory; Replicated Distributed Systems; Replication Consistency;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    MIPRO, 2012 Proceedings of the 35th International Convention
  • Conference_Location
    Opatija
  • Print_ISBN
    978-1-4673-2577-6
  • Type

    conf

  • Filename
    6240652