• DocumentCode
    258397
  • Title

    An Empirical Failure-Analysis of a Large-Scale Cloud Computing Environment

  • Author

    Garraghan, Peter ; Townend, Paul ; Jie Xu

  • Author_Institution
    Sch. of Comput., Univ. of Leeds, Leeds, UK
  • fYear
    2014
  • fDate
    9-11 Jan. 2014
  • Firstpage
    113
  • Lastpage
    120
  • Abstract
    Cloud computing research is in great need of statistical parameters derived from the analysis of real-world systems. One aspect of this is the failure characteristics of Cloud environments composed of workloads and servers; currently, few metrics are available that quantify failure and repair times of workloads and servers at large scale. Workload metrics in particular are critical for characterizing and accurately modeling workload behavior, enabling more realistic simulation of workloads and of system failure scenarios. This paper presents an analysis of failure data from a large-scale production Cloud environment (consisting of over 12,500 servers), and includes a study of failure and repair times and characteristics for both Cloud workloads and servers. Our results show that failure characteristics for workloads and servers are highly variable and that production Cloud workloads can be accurately modeled by a Gamma distribution. Repair times range from 30 seconds to 4 days for workloads and from 25 minutes to 8 days for servers.
  • Keywords
    cloud computing; gamma distribution; cloud workloads; empirical failure-analysis; large-scale cloud computing environment; real-world systems; statistical parameters; workload metrics; workload simulation; Computer crashes; Hardware; Maintenance engineering; Production; Servers; Dependability; Failure analysis; Repair analysis
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    High-Assurance Systems Engineering (HASE), 2014 IEEE 15th International Symposium on
  • Conference_Location
    Miami Beach, FL
  • Print_ISBN
    978-1-4799-3465-2
  • Type

    conf

  • DOI
    10.1109/HASE.2014.24
  • Filename
    6754595