• DocumentCode
    1805907
  • Title

    GEARSHIFT: Guaranteeing availability requirements in SLAs using hybrid fault tolerance

  • Author

    Gonzalez, Andres J. ; Helvik, Bjarne E. ; Tiwari, Prakriti ; Becker, Denis M. ; Wittner, Otto J.

  • Author_Institution
    Res. Dept., Telenor ASA, Fornebu, Norway
  • fYear
    2015
  • fDate
    April 26 2015-May 1 2015
  • Firstpage
    1373
  • Lastpage
    1381
  • Abstract
    The dependability of ICT systems is vital for today's society. However, operational systems are not fault free. Providers and customers have to define clear availability requirements and penalties for the delivered services by using SLAs. Fulfilling the stipulated availability may be expensive. The lack of mechanisms that allow fine control of the SLA risk may lead providers to over-dimension the provided resources. Therefore, a relevant question for ICT service providers is: how can SLA availability be guaranteed in a cost-efficient way? This paper studies how to combine different fault-tolerant techniques with different costs and properties in order to economically fulfill a given SLA requirement. GEARSHIFT is a mechanism that enables ICT providers to set the fault tolerance technique (gear ratio) needed, depending on the current service conditions and requirements. We illustrate how to use the proposed model in a backbone network scenario, using measurements from a production national network. Finally, we show that the total costs of delivering an ICT service follow a simple convex function, which allows easy selection of the optimal risk by properly tuning the combination of fault-tolerant techniques.
  • Keywords
    contracts; convex programming; costing; fault tolerant computing; risk management; GEARSHIFT; ICT service delivery cost; ICT service provider; ICT system dependability; SLA availability; SLA risk; availability requirement guarantee; backbone network scenario; convex function; cost efficiency; fault tolerant technique; hybrid fault tolerance; operational systems; optimal risk selection; production national network; service condition; service requirement; Approximation methods; Computers; Conferences; Convolution; Fault tolerance; Fault tolerant systems; Switches; SLA; accumulated downtime; fault tolerance; network recovery; renewal theory; risk optimization
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Computer Communications (INFOCOM), 2015 IEEE Conference on
  • Conference_Location
    Kowloon
  • Type

    conf

  • DOI
    10.1109/INFOCOM.2015.7218514
  • Filename
    7218514