DocumentCode
3335424
Title
Software rejuvenation: analysis, module and applications
Author
Huang, Y. ; Kintala, C. ; Kolettis, N. ; Fulton, N.D.
Author_Institution
AT&T Bell Labs., Murray Hill, NJ, USA
fYear
1995
fDate
27-30 June 1995
Firstpage
381
Lastpage
390
Abstract
Software rejuvenation is the concept of gracefully terminating an application and immediately restarting it at a clean internal state. In a client-server type of application where the server is intended to run perpetually for providing a service to its clients, rejuvenating the server process periodically during the most idle time of the server increases the availability of that service. In a long-running computation-intensive application, rejuvenating the application periodically and restarting it at a previous checkpoint increases the likelihood of successfully completing the application execution. We present a model for analyzing software rejuvenation in such continuously-running applications and express downtime and costs due to downtime during rejuvenation in terms of the parameters in that model. Threshold conditions for rejuvenation to be beneficial are also derived. We implemented a reusable module to perform software rejuvenation. That module can be embedded in any existing application on a UNIX platform with minimal effort. Experiences with software rejuvenation in a billing data collection subsystem of a telecommunications operations system and other continuously-running systems and scientific applications in AT&T are described.
Keywords
Unix; client-server systems; financial data processing; invoicing; operating systems (computers); software engineering; software fault tolerance; telecommunication computing; AT&T; UNIX platform; application restart; application termination; billing data collection subsystem; checkpoint; clean internal state; client-server application; continuously-running systems; costs; downtime; idle time; long-running computation-intensive application; reusable module; scientific applications; server process rejuvenation; service availability; software rejuvenation; telecommunications operations system; threshold conditions; Application software; Availability; Computer bugs; Costs; Databases; Debugging; Fault tolerance; Software systems; Telecommunications; Testing
fLanguage
English
Publisher
IEEE
Conference_Titel
Fault-Tolerant Computing, 1995. FTCS-25. Digest of Papers., Twenty-Fifth International Symposium on
Conference_Location
Pasadena, CA, USA
Print_ISBN
0-8186-7079-7
Type
conf
DOI
10.1109/FTCS.1995.466961
Filename
466961