DocumentCode
2626965
Title
STAR: a fault-tolerant system for distributed applications
Author
Sens, Pierre ; Folliot, Bertil
Author_Institution
IBP, Paris VI Univ., Paris, France
fYear
1993
fDate
1-4 Dec 1993
Firstpage
656
Lastpage
660
Abstract
The paper presents a fault-tolerant manager for distributed applications. This manager provides efficient recovery from host failures on networks of workstations. Independent checkpointing is used to automatically recover application processes affected by host failures. Domino effects are avoided by means of message logging and file version management. STAR provides efficient software failure detection by structuring hosts in a logical ring. Performance measurements in a real environment show both the benefits and the limits of our system.
Keywords
computer network reliability; fault tolerant computing; local area networks; reliability; software fault tolerance; system recovery; LAN; STAR; distributed applications; failure recovery; fault-tolerant manager; fault-tolerant system; file versions management; independent checkpointing; message logging; performance measurements; real environment; software failure detection; workstation networks; Application software; Checkpointing; Delay; Fault detection; Fault tolerance; Fault tolerant systems; Laboratories; Operating systems; Resource management; Workstations
fLanguage
English
Publisher
IEEE
Conference_Titel
Parallel and Distributed Processing, 1993. Proceedings of the Fifth IEEE Symposium on
Conference_Location
Dallas, TX
Print_ISBN
0-8186-4222-X
Type
conf
DOI
10.1109/SPDP.1993.395471
Filename
395471