Title :
The design and implementation of a reliable distributed operating system-ROSE
Author_Institution :
Dept. of Comput. Sci., Illinois Univ., Urbana-Champaign, IL, USA
Abstract :
ROSE, a modular distributed operating system that provides support for building reliable applications, is designed and implemented. Failure detection capabilities are provided by a failure detection server. Configuration objects can be used to capture the relationship among multiple processes that cooperate to replicate certain resources. Replicated address space (RAS) objects, whose content is accessible with a high probability despite hardware failures, can be used to increase data availability. Finally, a resistant process (RP) abstraction allows user processes to survive hardware failures with minimal interruption. Two different implementations of RP are provided: one periodically checkpoints information about its state in an RAS object; the other uses replicated execution, running the same code on different nodes at the same time.
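The checkpoint-based RP variant mentioned in the abstract can be illustrated with a toy sketch. This is not ROSE's interface: `Replica`, `ReplicatedStore`, and `run_counter` are hypothetical names invented here, the "RAS object" is reduced to an in-memory dictionary copied across simulated replicas, and a node failure is simulated by raising an exception.

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    """One copy of the replicated state on a hypothetical node."""
    alive: bool = True
    data: dict = field(default_factory=dict)


class ReplicatedStore:
    """Toy stand-in for an RAS object (assumption, not the ROSE API).

    Writes go to every live replica; reads succeed as long as at least
    one live replica holds the key. Failed replicas never rejoin here,
    so all live replicas always hold the same data.
    """

    def __init__(self, n_replicas=3):
        self.replicas = [Replica() for _ in range(n_replicas)]

    def write(self, key, value):
        live = [r for r in self.replicas if r.alive]
        if not live:
            raise RuntimeError("no live replica")
        for r in live:
            r.data[key] = value

    def read(self, key):
        for r in self.replicas:
            if r.alive and key in r.data:
                return r.data[key]
        raise KeyError(key)


def run_counter(store, steps, checkpoint_every, crash_at=None):
    """Count up to `steps`, checkpointing the counter periodically.

    On start, the process resumes from the last checkpoint if one exists.
    `crash_at` simulates a node failure at that counter value.
    """
    try:
        state = store.read("counter")
    except KeyError:
        state = 0  # no checkpoint yet: start from scratch
    while state < steps:
        state += 1
        if state % checkpoint_every == 0:
            store.write("counter", state)
        if crash_at is not None and state == crash_at:
            raise RuntimeError("simulated node failure")
    return state
```

In this sketch, a restarted process loses only the work done since the last checkpoint, and the checkpoint itself remains readable after one replica fails. The replicated-execution variant trades that lost work for the cost of running every step on several nodes at once.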
Keywords :
distributed processing; fault tolerant computing; network operating systems; ROSE; design; failure detection; implementation; reliable distributed operating system; replicated address space objects; resistant process; Application software; Buildings; Computer applications; Computer science; Distributed computing; Hardware; Operating systems; Protocols; Software algorithms; Writing;
Conference_Title :
Reliable Distributed Systems, 1990. Proceedings., Ninth Symposium on
Conference_Location :
Huntsville, AL
Print_ISBN :
0-8186-2081-1
DOI :
10.1109/RELDIS.1990.93946