Title :
Processor failure recovery for a resource sharing algorithm
Author :
Newman, I.A. ; Stallard, R.P. ; Woodward, M.C.
Author_Institution :
Loughborough University of Technology, Department of Computer Studies, Loughborough, UK
fDate :
3/1/1986
Abstract :
With the increase in popularity of distributed computer systems, the reliability of the system as a whole is becoming more important. A recently published combined resource sharing algorithm showed how the atomic operations required for resource management in a closely coupled multiprocessor system could be provided. This paper describes a recovery system that may be incorporated within the earlier algorithm to enable continued and correct operation of the system despite the failure of one or more component processors. A distributed simulation of the recovery mechanism is described and results from simulation runs are presented.
Keywords :
distributed processing; fault tolerant computing; combined resource sharing algorithm; component processors; distributed computer systems; distributed simulation; multiprocessor system; processor failure recovery; resource management; resource sharing algorithm
Journal_Title :
Computers and Digital Techniques, IEE Proceedings E
DOI :
10.1049/ip-e.1986.0008