Title :
Reliability MicroKernel: Providing Application-Aware Reliability in the OS
Author :
Wang, Long ; Kalbarczyk, Zbigniew ; Gu, Weining ; Iyer, Ravishankar K.
Author_Institution :
Univ. of Illinois at Urbana-Champaign, Urbana-Champaign
Abstract :
This paper describes the Reliability MicroKernel (RMK) framework, a loadable kernel module (or device driver) that provides application-aware reliability and dynamically configures reliability mechanisms. Application-aware reliability techniques transparently exploit characteristics of application and system execution to achieve low-latency detection and low-overhead checkpointing. The RMK prototype is implemented in both Linux and Windows; it supports detection of application and OS failures, and transparent application checkpointing. Experimental results show that system hang detection and application hang detection, which exploit characteristics of application and system behavior, can achieve high coverage (100% observed in our experiments) with a low false positive rate. Moreover, the performance overhead of RMK and its detection/checkpointing mechanisms is small: 0.6% for application hang detection and 0.1% for transparent application checkpointing in the experiments.
Keywords :
Linux; checkpointing; operating system kernels; reliability; Windows; application hang detection; application-aware reliability; operating system; reliability microkernel; transparent application checkpointing; Application software; Computer architecture; Computer crashes; Hardware; Kernel; Monitoring; Operating systems; Pins; OS-level error detection; system crash/hang detection
Journal_Title :
Reliability, IEEE Transactions on
DOI :
10.1109/TR.2007.909758