Regional Information Center for Science and Technology - Server recovery using naturally replicated state: a case study

DocumentCode :

1671032

Title :

Server recovery using naturally replicated state: a case study

Author :

Devarakonda, Murthy ; Kish, Bill ; Mohindra, Ajay

Author_Institution :

IBM Thomas J. Watson Res. Center, Yorktown Heights, NY, USA

fYear :

1995

Firstpage :

213

Lastpage :

220

Abstract :

This paper describes the design and preliminary measurements of a file server recovery scheme that uses naturally replicated state among clients. This scheme, implemented in the Calypso file system, is truly transparent to the user and avoids the overhead of explicit replication. A three-phase protocol reconstructs the server state either on a backup node (if disks are multi-ported) or on the rebooted server node. Measurements show that the recovery time is about 21 seconds for a busy 10-node cluster. However, the time to rebuild the distributed state is only about 1.5 seconds, and most of the recovery time is spent in replaying the write-ahead log of the underlying file system. Fortunately, the log redo time is bounded by the log size.

Keywords :

client-server systems; computer network reliability; file servers; protocols; software fault tolerance; system recovery; Calypso file system; explicit replication; log redo time; naturally replicated state; rebooted server node; server recovery; three-phase protocol; underlying file system; write-ahead log; Computer aided software engineering; Costs; File servers; File systems; Maintenance engineering; Permission; Protocols; Size measurement; Testing; Time measurement;

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Proceedings of the 15th International Conference on Distributed Computing Systems, 1995

Conference_Location :

Vancouver, BC

ISSN :

1063-6927

Print_ISBN :

0-8186-7025-8

Type :

conf

DOI :

10.1109/ICDCS.1995.500022

Filename :

500022

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=1671032