DocumentCode :
750964
Title :
Quantifying the performability of cluster-based services
Author :
Nagaraja, Kiran ; Gama, Gustavo ; Bianchini, Ricardo ; Martin, Richard P. ; Meira, Wagner, Jr. ; Nguyen, Thu D.
Author_Institution :
Dept. of Comput. Sci., Rutgers Univ., New Brunswick, NJ, USA
Volume :
16
Issue :
5
fYear :
2005
fDate :
5/1/2005 12:00:00 AM
Firstpage :
456
Lastpage :
467
Abstract :
In this paper, we propose a two-phase methodology for systematically evaluating the performability (performance and availability) of cluster-based Internet services. In the first phase, evaluators use a fault-injection infrastructure to characterize the service's behavior in the presence of faults. In the second phase, evaluators use an analytical model to combine an expected fault load with measurements from the first phase to assess the service's performability. Using this model, evaluators can study the service's sensitivity to different design decisions, fault rates, and other environmental factors. To demonstrate our methodology, we study the performability of a multitier Internet service. In particular, we evaluate the performance and availability of three soft state maintenance strategies for an online bookstore service in the presence of seven classes of faults. Among other interesting results, we clearly isolate the effect of different faults, showing that the tier of Web servers is responsible for an often dominant fraction of the service unavailability. Our results also demonstrate that storing the soft state in a database achieves better performability than storing it in main memory (even when the state is efficiently replicated) when we weight performance and availability equally. Based on our results, we conclude that service designers may want an unbalanced system in which they heavily load highly available components and leave more spare capacity for components that are likely to fail more often.
Keywords :
Internet; fault tolerance; performance evaluation; workstation clusters; Internet service; Web server; analytical model; cluster-based service; database achieves; fault-injection infrastructure; main memory; online bookstore service; performance availability; soft state maintenance strategy; storing; Analytical models; Availability; Computer Society; Environmental factors; Performance analysis; Performance evaluation; Phase measurement; Scalability; Space exploration; Web and internet services; Internet services; Performance; availability; fault tolerance
fLanguage :
English
Journal_Title :
Parallel and Distributed Systems, IEEE Transactions on
Publisher :
IEEE
ISSN :
1045-9219
Type :
jour
DOI :
10.1109/TPDS.2005.61
Filename :
1411733