Title :
Software exploitation of a fault-tolerant computer with a large memory
Author :
Eskesen, F. ; Hack, M. ; Iyengar, A. ; King, R.P. ; Halim, N.
Author_Institution :
IBM Thomas J. Watson Res. Center, Yorktown Heights, NY, USA
Abstract :
The DM/6000 hardware (a prototype, fault-tolerant RS/6000 built at the T.J. Watson Research Center) provides fault tolerance and a large, nonvolatile main memory. Running a commercial, general-purpose operating system on it does not, by itself, increase software availability. In fact, the time needed to rebuild the contents of a large memory may decrease availability. We describe our techniques for hiding most of the main memory, so that the operating system can access it only by way of services separate from the operating system. This can allow the memory and those access services to achieve much higher availability, which, in turn, increases the availability of the system as a whole. We also performed simulation studies to determine the conditions under which this system organization can lead to improved performance for recoverable database applications.
Keywords :
cache storage; fault tolerant computing; multiprocessing systems; operating systems (computers); software performance evaluation; DM/6000; RS/6000; fault-tolerant computer; general-purpose operating system; large nonvolatile main memory; multiprocessor; performance; recoverable database applications; simulation; software availability; Application software; Availability; Computer hacking; Databases; Fault tolerance; Fault tolerant systems; Hardware; Kernel; Nonvolatile memory; Operating systems
Conference_Title :
Fault-Tolerant Computing, 1998. Digest of Papers. Twenty-Eighth Annual International Symposium on
Conference_Location :
Munich, Germany
Print_ISBN :
0-8186-8470-4
DOI :
10.1109/FTCS.1998.689484