DocumentCode :
2052255
Title :
On-the-fly kernel updates for high-performance computing clusters
Author :
Makris, Kristis ; Ryu, Kyung Dong
Author_Institution :
Dept. of Comput. Sci. & Eng., Arizona State Univ., Tempe, AZ
fYear :
2006
fDate :
25-29 April 2006
Abstract :
High-performance computing clusters running long-lived tasks currently cannot have kernel software updates applied to them without causing system downtime. These clusters miss opportunities for increased performance via specialized kernel support, cannot benefit from new kernel features, and continue to operate with kernel security holes unpatched, at least until the next scheduled maintenance date. We developed a system enabling dynamic kernel updates in parallel computing clusters to address these problems. Our system, DynAMOS, is founded on execution flow hijacking through function cloning. It enables commodity operating systems popularly used in clusters to gain adaptive and mutative capabilities. To demonstrate the efficacy of our system, we illustrate our experience in dynamically updating and extending a Linux cluster. We introduce adaptive memory paging for efficient gang scheduling, extend the kernel's process scheduler to support unobtrusive fine-grain cycle stealing, apply public security fixes, and inject performance monitoring functionality into a selection of kernel functions. Our benchmarks show that the overhead imposed by DynAMOS is mostly in the range of 1-8% for common Linux kernel functions.
Keywords :
operating system kernels; scheduling; software maintenance; workstation clusters; DynAMOS; Linux cluster; adaptive memory paging; commodity operating systems; execution flow hijacking; fine-grain cycle stealing; function cloning; gang scheduling; high-performance computing clusters; kernel software updates; parallel dynamic kernel updates; performance monitoring; public security fixes; Application software; Cloning; Instruments; Kernel; Linux; Magnetohydrodynamic power generation; Operating systems; Parallel processing; Processor scheduling; Security
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Parallel and Distributed Processing Symposium, 2006. IPDPS 2006. 20th International
Conference_Location :
Rhodes Island
Print_ISBN :
1-4244-0054-6
Type :
conf
DOI :
10.1109/IPDPS.2006.1639690
Filename :
1639690