DocumentCode :
3335999
Title :
Gracefully degrading systems using the bulk-synchronous parallel model with randomised shared memory
Author :
Savva, A. ; Nanya, T.
Author_Institution :
Dept. of Comput. Sci., Tokyo Inst. of Technol., Japan
fYear :
1995
fDate :
27-30 June 1995
Firstpage :
299
Lastpage :
308
Abstract :
The bulk-synchronous parallel model (BSPM) was proposed by Valiant (1990) as a bridging model for parallel computation. Combined with randomised shared memory (RSM), it offers an asymptotically optimal emulation of the PRAM. Using the BSPM with RSM, we show how a gracefully degrading massively parallel system can be obtained through (i) memory duplication, which ensures global memory integrity and speeds up reconfiguration, and (ii) a global reconfiguration method that restores the logical properties of the system after a fault occurs. We assume fail-stop processors, single faults, no spare processors, and no significant loss of network throughput as a result of faults. Work done during reconfiguration is shared equally among the live processors, with minimal coordination. The overhead of the scheme and the graceful degradation achieved depend on the program being executed. We evaluate the reconfiguration, overhead, and graceful degradation of the system experimentally.
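The placement and recovery idea described in the abstract can be illustrated with a small sequential simulation. This is a sketch under stated assumptions, not the authors' scheme: the SHA-256 based hash, the function names `place`, `build` and `reconfigure`, and the choice of exactly two copies per address are all hypothetical, and the loop runs on one machine instead of being shared among the live processors.

```python
import hashlib


def _h(addr: int, salt: int, n: int) -> int:
    """Deterministic pseudo-random value in range(n) for (salt, addr).
    Hypothetical stand-in for the hash family used by randomised shared memory."""
    digest = hashlib.sha256(f"{salt}:{addr}".encode()).digest()
    return int.from_bytes(digest[:8], "big") % n


def place(addr: int, live: list[int]) -> tuple[int, int]:
    """Return (primary, backup): two distinct live processors that hold addr."""
    n = len(live)
    if n < 2:
        raise ValueError("duplication needs at least two live processors")
    p = _h(addr, 0, n)
    # An offset in 1..n-1 guarantees the backup differs from the primary.
    b = (p + 1 + _h(addr, 1, n - 1)) % n
    return live[p], live[b]


def build(values: dict[int, int], live: list[int]) -> dict[int, dict[int, int]]:
    """Distribute every address to its two holders among the live processors."""
    memory: dict[int, dict[int, int]] = {p: {} for p in live}
    for addr, val in values.items():
        for holder in place(addr, live):
            memory[holder][addr] = val
    return memory


def reconfigure(memory, live, failed):
    """Rebuild the duplicated memory after a single fail-stop fault.
    Assumes one fault and no spares, as in the abstract."""
    survivors = [p for p in live if p != failed]
    # Each address had copies on two distinct processors, so one copy survives.
    values: dict[int, int] = {}
    for p in survivors:
        values.update(memory[p])
    # Rehash over the smaller processor set to restore two copies per address.
    return survivors, build(values, survivors)
```

For example, `reconfigure(build(values, [0, 1, 2, 3]), [0, 1, 2, 3], failed=2)` returns the survivor list `[0, 1, 3]` and a memory image in which every address is again held by two distinct survivors.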
Keywords :
fault tolerant computing; memory architecture; parallel architectures; parallel machines; random-access storage; reconfigurable architectures; shared memory systems; PRAM; asymptotically optimal emulation; bulk-synchronous parallel model; fail-stop processors; fault occurrence; global memory integrity; global reconfiguration method; gracefully degrading systems; logical property restoration; massively parallel system; memory duplication; network throughput; overhead; parallel computation; parallel random access machine; randomised shared memory; Computational modeling; Computer science; Concurrent computing; Degradation; Emulation; Hardware; Phase change random access memory; Power system modeling; Redundancy; Throughput;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Fault-Tolerant Computing, 1995. FTCS-25. Digest of Papers., Twenty-Fifth International Symposium on
Conference_Location :
Pasadena, CA, USA
Print_ISBN :
0-8186-7079-7
Type :
conf
DOI :
10.1109/FTCS.1995.466969
Filename :
466969