Title :
Robust parallel resource management in shared memory multiprocessor systems
Author :
Yen, I-Ling ; Bastani, Farokh B.
Author_Institution :
Dept. of Comput. Sci., Michigan State Univ., East Lansing, MI, USA
Abstract :
Parallel machines are increasingly used for applications that require both quick response time and high reliability. This poses a programming challenge: the system must have sufficient redundancy to cope with failures and, at the same time, must use its redundant components effectively during failure-free periods to enhance performance. Among these issues, resource management is critical to the robustness and efficiency of the system. A good resource management algorithm should allow the system to continue operating even in the presence of a significant number of processor failures, and the incorporation of fault tolerance should not incur excessive overhead. In this paper, we develop two robust resource management algorithms that simultaneously achieve the twin objectives of low overhead and high reliability.
Keywords :
fault tolerant computing; redundancy; resource allocation; shared memory systems; failures; fault tolerance; high reliability; low overhead; parallel resource management; resource management; shared memory multiprocessor systems; Application software; Computer science; Data structures; Delay; Fault tolerant systems; Memory management; Multiprocessing systems; Parallel machines; Resource management; Robustness
Conference_Title :
Parallel Processing Symposium, 1995. Proceedings., 9th International
Conference_Location :
Santa Barbara, CA
Print_ISBN :
0-8186-7074-6
DOI :
10.1109/IPPS.1995.395971