DocumentCode
1703270
Title
Selecting a “primary partition” in partitionable asynchronous distributed systems
Author
Bartoli, Alberto ; Babaoglu, Ozalp
Author_Institution
Dipt. Ingegneria dell'Inf., Pisa Univ., Italy
fYear
1997
Firstpage
138
Lastpage
145
Abstract
We consider network applications that are based on the process group paradigm. When such applications are deployed over networks that are subject to failures, they may partition across several disconnected clusters, so that multiple views of the group's current composition exist concurrently. Application semantics determine which operations, if any, can be performed in different partitions without compromising consistency. For certain application classes, most (possibly all) operations need to be confined to a single primary partition, while other partitions are allowed to service only a (possibly empty) subset of the operations. We propose a mechanism for deciding when a view constitutes the primary partition for the group. Our solution is highly flexible and has the following novel features: each group member can establish whether it belongs to the primary partition, based solely on local information; the group can be dynamic, as processes voluntarily join and leave it; the selection rule for establishing the primary partition need not be universal but can be decided on a per-application basis and can be modified at run time; the primary partition can be re-established even after total failures. Layering our solution on top of a partitionable group membership service allows a wide range of applications with different and possibly conflicting notions of “primary partition” to be supported on a common computing base.
Keywords
computer network management; data integrity; fault tolerant computing; performance evaluation; reliability; application classes; application semantics; common computing base; disconnected clusters; group member; local information; multiple views; network applications; partitionable asynchronous distributed systems; partitionable group membership service; primary partition; process group paradigm; selection rule; single primary partition; total failures; Computer crashes; Computer science; Electronic mail; Intelligent networks; Partitioning algorithms; Runtime
fLanguage
English
Publisher
IEEE
Conference_Titel
Reliable Distributed Systems, 1997. Proceedings., The Sixteenth Symposium on
Conference_Location
Durham, NC
ISSN
1060-9857
Print_ISBN
0-8186-8177-2
Type
conf
DOI
10.1109/RELDIS.1997.632809
Filename
632809