DocumentCode :
2649740
Title :
High availability of the memory hierarchy in a cluster
Author :
Morin, Christine ; Lottiaux, Renaud ; Kermarrec, Anne-Marie
Author_Institution :
IRISA/INRIA, Campus Univ. de Beaulieu, Rennes, France
fYear :
2000
fDate :
2000
Firstpage :
134
Lastpage :
143
Abstract :
A single-level store (SLS) integrating a shared virtual memory and a parallel file system with file mapping as its interface is attractive for the execution of high-performance applications in a cluster. However, the probability of a node reboot or failure is quite high. In this paper, we present the design of a highly available SLS system. Our approach combines checkpointing in memory and permanent checkpointing on disk in a cluster using all cluster memory and disk resources. Preliminary performance results show the applicability of the proposed approach for parallel applications with huge input/output requirements.
Keywords :
parallel memories; performance evaluation; shared memory systems; system recovery; virtual storage; workstation clusters; cluster computing; cluster disk resources; cluster memory resources; file mapping interface; high-performance applications; highly available system; input/output requirements; memory checkpointing; memory hierarchy availability; node failure; node reboot; parallel applications; parallel file system; performance; permanent on-disk checkpointing; shared virtual memory; single-level store; Availability; Bandwidth; Bit error rate; Checkpointing; Fault tolerance; File systems; Laser sintering; Memory management; Microprocessors; Support vector machines;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Reliable Distributed Systems, 2000. SRDS-2000. Proceedings The 19th IEEE Symposium on
Conference_Location :
Nürnberg, Germany
Print_ISBN :
0-7695-0543-0
Type :
conf
DOI :
10.1109/RELDI.2000.885401
Filename :
885401