• DocumentCode
    1844815
  • Title

    Application-transparent fault tolerance in distributed systems

  • Author

    Becker, Thomas

  • Author_Institution
    Dept. of Comput. Sci., Kaiserslautern Univ., Germany
  • fYear
    1994
  • fDate
    21-23 Mar 1994
  • Firstpage
    36
  • Lastpage
    45
  • Abstract
    We present a new software architecture in which all concepts necessary to achieve fault tolerance can be added to an application automatically, without any source code changes. As a case study, we consider the problem of providing a reliable service despite node failures by executing a group of replicated servers. Replica creation and management, as well as failure detection and recovery, are performed automatically by a separate fault tolerance layer (ft-layer) that is inserted between the server application and the operating system kernel. The layer is invisible to the application because it provides the same functional interface as the operating system kernel, making the fault tolerance property of the service completely transparent to the application. A major advantage of our architecture is that the layer encapsulates both fault tolerance mechanisms and policies. This allows for maximum flexibility in the choice of appropriate methods for fault tolerance without any changes in the application code.
  • Keywords
    distributed processing; fault tolerant computing; operating systems (computers); software engineering; software reliability; application code; application-transparent fault tolerance; distributed systems; failure detection; failure recovery; fault tolerance layer; functional interface; node failures; operating system kernel; reliable service; replicated servers; server application; software architecture; source code change; Computer science; Fault tolerance; Fault tolerant systems; Joining processes; Kernel; Libraries; Operating systems; Programming profession; Software architecture; Testing;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Proceedings of the 2nd International Workshop on Configurable Distributed Systems, 1994
  • Conference_Location
    Pittsburgh, PA
  • Print_ISBN
    0-8186-5390-6
  • Type

    conf

  • DOI
    10.1109/IWCDS.1994.289937
  • Filename
    289937