Title :
Recovery method based on communicating extended finite state machine (CEFSM) for mobile communications
Author :
Beiroumi, Mohammad Zib ; Iversen, Villy Baek
Author_Institution :
Motorola CGISS, Glostrup, Denmark
Abstract :
Fast recovery from software and hardware failures is essential to communication systems, especially when they are used for mission-critical applications such as public safety systems. A failure in the network infrastructure can affect a large number of users and may result in loss of life. The infrastructure software applications that provide services to mobile stations according to defined communication protocols play a key role in system availability. The real-time, peer-to-peer nature of these communication protocols makes it challenging to develop a recovery mechanism that can work in such environments. In this paper, we introduce a new recovery method that takes into account the layered architecture of the communication protocols and their peer-to-peer communication pattern. The method is based on communicating extended finite state machines (CEFSMs) and does not assume transient and fail-stop failures. Furthermore, an experimental testbed has been implemented to evaluate our new approach. The experimental results show that the infrastructure applications can recover reliably and quickly restore the service level that the system was providing immediately prior to the failure. Moreover, the failure-free overhead of this approach is relatively low: it was found experimentally to be less than 5%.
Keywords :
finite state machines; mobile communication; mobile computing; peer-to-peer computing; protocols; system recovery; telecommunication network management; telecommunication network reliability; telecommunication network topology; CEFSM; communicating extended finite state machine; fail-stop failure; hardware failure; infrastructure software applications; layered architecture; mission-critical applications; mobile communications; mobile stations; network infrastructure; public safety systems; real-time peer-to-peer protocol; service restoration; software failure; software recovery; system availability; system recovery; transient failure; Application software; Automata; Availability; Communication system software; Hardware; Mission critical systems; Mobile communication; Peer to peer computing; Protocols; Software safety;
Conference_Title :
Engineering of Complex Computer Systems, 2005. ICECCS 2005. Proceedings. 10th IEEE International Conference on
Print_ISBN :
0-7695-2284-X
DOI :
10.1109/ICECCS.2005.70