DocumentCode
665563
Title
Towards fast OS rejuvenation: An experimental evaluation of fast OS reboot techniques
Author
Bovenzi, Antonio ; Alonso, J. Marcos ; Yamada, Hiroyoshi ; Russo, S. ; Trivedi, Kishor S.
fYear
2013
fDate
4-7 Nov. 2013
Firstpage
61
Lastpage
70
Abstract
Continuous or high availability is a key requirement for many modern IT systems. Computer operating systems play an important role in IT systems availability. Due to the complexity of their architecture, they are prone to suffer failures due to several types of software faults. Software aging causes a nonnegligible fraction of these failures. It leads to an accumulation of errors with time, increasing the system failure rate. This phenomenon can be accompanied by performance degradation and eventually system hang or even crash. As a countermeasure, software rejuvenation entails stopping the system, cleaning its internal state, and resuming its operation. This process usually incurs downtime. For an operating system, the downtime impacts any application running on top of it. Several solutions have been developed to speed up the boot time of operating systems in order to reduce the downtime overhead. We present a study of two fast OS reboot techniques for rejuvenation of Linux-based operating systems, namely Kexec and Phase-based reboot. The study measures the performance penalty they introduce and the gain in reduction of downtime overhead. The results reveal that the Kexec and Phase-based reboot have no statistically significant impact in terms of performance penalty from the user perspective. However, they may require extra resource (e.g., CPU) usage. The downtime overhead reduction, compared with normal Linux and VM reboots, is 77% and 79% in Kexec and Phase-based reboot, respectively.
Keywords
Linux; software architecture; software metrics; software performance evaluation; system recovery; virtual machines; IT systems availability; Kexec reboot; Linux-based operating systems; VM reboots; architecture complexity; computer operating systems; downtime overhead reduction; fast OS reboot techniques; fast OS rejuvenation; performance penalty; phase-based reboot; software aging; software rejuvenation; system failure rate; Aging; Hardware; Image restoration; Kernel; Linux
fLanguage
English
Publisher
IEEE
Conference_Titel
Software Reliability Engineering (ISSRE), 2013 IEEE 24th International Symposium on
Conference_Location
Pasadena, CA
Type
conf
DOI
10.1109/ISSRE.2013.6698905
Filename
6698905