Title :
A low-latency checkpointing scheme for mobile computing systems
Author :
Li, Guohui ; Shu, LihChyun
Author_Institution :
Sch. of Comput. Sci. & Technol., Huazhong Univ. of Sci. & Technol., Wuhan, China
Abstract :
Fault-tolerant mobile computing systems have requirements and restrictions that conventional distributed systems do not take into account. This paper presents a coordinated checkpointing scheme that reduces the delay involved in a global checkpointing process for mobile systems. A piggyback technique tracks and records the checkpoint dependency information among processes during normal message transmission. During checkpointing, a concurrent checkpointing technique uses this pre-recorded dependency information to minimize process blocking time: checkpoint requests are sent to all dependent processes at once, which saves the time otherwise needed to trace the dependency tree. Our checkpoint algorithm forces a minimum number of processes to take checkpoints. Via probability-based analysis, we show that our scheme can significantly reduce the latency associated with checkpoint request propagation compared to traditional coordinated checkpointing approaches.
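The piggyback idea in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm, whose details the abstract does not give; the `Process` class, its `deps` set, and the message layout are all hypothetical names chosen for illustration. The sketch assumes each message carries the sender's id and the sender's known dependencies, so that a checkpoint initiator already holds its full dependency set and can address every dependent process in one step instead of tracing the dependency tree hop by hop.

```python
from dataclasses import dataclass, field


@dataclass
class Process:
    """Hypothetical process that tracks checkpoint dependencies."""
    pid: str
    deps: set = field(default_factory=set)  # ids this process depends on

    def send(self, payload):
        # Piggyback the sender id and its current dependency set
        # on the ordinary application message (assumed layout).
        return {"payload": payload, "sender": self.pid,
                "deps": frozenset(self.deps)}

    def receive(self, msg):
        # Record the direct dependency on the sender plus the
        # transitive dependencies carried by the message.
        self.deps |= {msg["sender"]} | set(msg["deps"])
        self.deps.discard(self.pid)  # a process never depends on itself

    def checkpoint_targets(self):
        # All processes that must checkpoint with this initiator;
        # requests can go to all of them at once.
        return sorted(self.deps)


p1, p2, p3, p4 = (Process(f"p{i}") for i in range(1, 5))
p2.receive(p1.send("a"))   # p2 now depends on p1
p3.receive(p2.send("b"))   # p3 now depends on p2 and, transitively, p1
targets = p3.checkpoint_targets()
```

In this run `p4` exchanged no messages, so it is absent from `targets` and would not be asked to checkpoint, which mirrors the abstract's goal of involving only dependent processes.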
Keywords :
checkpointing; concurrency control; message passing; mobile computing; optimisation; probability; program diagnostics; software fault tolerance; checkpoint algorithm; checkpoint dependency information; concurrent checkpointing; coordinated checkpointing; dependency tree; distributed systems; fault-tolerant mobile computing systems; low-latency checkpointing; message transmission; piggyback technique; probability-based analysis; process blocking time; requirements; restrictions; Bandwidth; Character recognition; Checkpointing; Computer science; Delay; Distributed computing; Energy conservation; Fault tolerant systems; Home appliances; Mobile computing;
Conference_Title :
Computer Software and Applications Conference, 2005. COMPSAC 2005. 29th Annual International
Print_ISBN :
0-7695-2413-3
DOI :
10.1109/COMPSAC.2005.26