DocumentCode
1805907
Title
GEARSHIFT: Guaranteeing availability requirements in SLAs using hybrid fault tolerance
Author
Gonzalez, Andres J. ; Helvik, Bjarne E. ; Tiwari, Prakriti ; Becker, Denis M. ; Wittner, Otto J.
Author_Institution
Res. Dept., Telenor ASA, Fornebu, Norway
fYear
2015
fDate
April 26 2015-May 1 2015
Firstpage
1373
Lastpage
1381
Abstract
The dependability of ICT systems is vital for today's society. However, operational systems are not fault-free. Providers and customers have to define clear availability requirements and penalties for the delivered services by using SLAs. Fulfilling the stipulated availability may be expensive. The lack of mechanisms that allow fine control of the SLA risk may lead to over-dimensioning of the provided resources. Therefore, a relevant question for ICT service providers is: how can SLA availability be guaranteed in a cost-efficient way? This paper studies how to combine different fault-tolerant techniques with different costs and properties, in order to economically fulfill a given SLA requirement. GEARSHIFT is a mechanism that enables ICT providers to set the fault tolerance technique (gear ratio) needed, depending on the current service conditions and requirements. We illustrate how to use the proposed model in a backbone network scenario, using measurements from a production national network. Finally, we show that the total cost of delivering an ICT service follows a simple convex function, which allows easy selection of the optimal risk by properly tuning the combination of fault-tolerant techniques.
Keywords
contracts; convex programming; costing; fault tolerant computing; risk management; GEARSHIFT; ICT service delivery cost; ICT service provider; ICT system dependability; SLA availability; SLA risk; availability requirement guarantee; backbone network scenario; convex function; cost efficiency; fault tolerant technique; hybrid fault tolerance; operational systems; optimal risk selection; production national network; service condition; service requirement; Approximation methods; Computers; Conferences; Convolution; Fault tolerance; Fault tolerant systems; Switches; SLA; accumulated downtime; fault tolerance; network recovery; renewal theory; risk optimization;
fLanguage
English
Publisher
IEEE
Conference_Titel
Computer Communications (INFOCOM), 2015 IEEE Conference on
Conference_Location
Kowloon
Type
conf
DOI
10.1109/INFOCOM.2015.7218514
Filename
7218514