• DocumentCode
    1831351
  • Title

    Proactive Failure Management by Integrated Unsupervised and Semi-Supervised Learning for Dependable Cloud Systems

  • Author

    Guan, Qiang ; Zhang, Ziming ; Fu, Song

  • Author_Institution
    Dept. of Comput. Sci. & Eng., Univ. of North Texas, Denton, TX, USA
  • fYear
    2011
  • fDate
    22-26 Aug. 2011
  • Firstpage
    83
  • Lastpage
    90
  • Abstract
    Cloud computing systems continue to grow in scale and complexity. They also change dynamically due to the addition and removal of system components, changing execution environments, frequent updates and upgrades, online repairs, and more. In such large-scale, complex, and dynamic systems, failures are common. In this paper, we present a failure prediction mechanism that exploits both unsupervised and semi-supervised learning techniques for building dependable cloud computing systems. The unsupervised failure detection method uses an ensemble of Bayesian models. It characterizes normal execution states of the system and detects anomalous behaviors. After the anomalies are verified by system administrators, labeled data are available. Then, we apply supervised learning based on a decision tree classifier to predict future failure occurrences in the cloud. Experimental results in an institute-wide cloud computing system show that our proposed method can forecast failure dynamics with high accuracy.
  • Keywords
    Bayes methods; cloud computing; decision trees; pattern classification; system recovery; unsupervised learning; Bayesian models; decision tree classifier; dependable cloud computing systems; failure prediction mechanism; proactive failure management; semi-supervised learning; unsupervised failure detection method; unsupervised learning; Bayesian methods; Cloud computing; Data models; Decision trees; Mathematical model; Monitoring; Mutual information; Bayesian detector; Cloud systems; Decision tree; Dependable systems; Learning algorithms;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Availability, Reliability and Security (ARES), 2011 Sixth International Conference on
  • Conference_Location
    Vienna
  • Print_ISBN
    978-1-4577-0979-1
  • Electronic_ISBN
    978-0-7695-4485-4
  • Type

    conf

  • DOI
    10.1109/ARES.2011.20
  • Filename
    6045942