Title :
Modeling fault-tolerant mobile agent execution as a sequence of agreement problems
Author :
Pleisch, Stefan ; Schiper, André
Author_Institution :
IBM Zurich Res. Lab., Rüschlikon, Switzerland
Abstract :
Fault tolerance is fundamental to the further development of mobile agent applications. In the context of mobile agents, fault tolerance prevents a partial or complete loss of the agent, i.e., it ensures that the agent arrives at its destination. Simple approaches such as checkpointing are prone to blocking. Replication can, in principle, improve on checkpointing-based solutions. However, existing replication-based solutions either assume a perfect failure detection mechanism (which is not realistic in an environment such as the Internet) or rely on complex solutions based on leader election and distributed transactions, of which only a subset prevents blocking. The paper proposes a novel approach to fault-tolerant mobile agent execution, which is based on modeling agent execution as a sequence of agreement problems. Each agreement problem is one instance of the well-understood consensus problem. Our solution does not require a perfect failure detection mechanism, yet it prevents blocking and ensures that the agent is executed exactly once.
Keywords :
fault tolerant computing; mobile computing; system recovery; agreement problems; checkpointing; consensus problem; failure detection mechanism; fault tolerance; fault tolerant mobile agent execution; fault-tolerant mobile agent execution modeling; mobile agent applications; Checkpointing; Delay; Fault tolerance; Fault tolerant systems; Laboratories; Mechanical factors; Mobile agents; Nominations and elections; Operating systems; Uncertainty;
Conference_Title :
Proceedings of the 19th IEEE Symposium on Reliable Distributed Systems (SRDS-2000), 2000
Conference_Location :
Nürnberg
Print_ISBN :
0-7695-0543-0
DOI :
10.1109/RELDI.2000.885388