• DocumentCode
    2848817
  • Title

    Network-based problem detection for distributed systems

  • Author

    Kashima, Hisashi ; Tsumura, Tadashi ; Idé, Tsuyoshi ; Nogayama, Takahide ; Hirade, Ryo ; Etoh, Hiroaki ; Fukuda, Takeshi

  • Author_Institution
    IBM Tokyo Res. Lab., Kanagawa, Japan
  • fYear
    2005
  • fDate
    5-8 April 2005
  • Firstpage
    978
  • Lastpage
    989
  • Abstract
    We introduce a network-based problem detection framework for distributed systems, which includes a data-mining method for discovering dynamic dependencies among distributed services from transaction data collected from the network, and a novel problem detection method based on the discovered dependencies. From observed containments of transaction execution time periods, we estimate the probabilities of accidental and non-accidental containments, and build a competitive model for discovering direct dependencies by using a model estimation method based on the online EM algorithm. Utilizing the discovered dependency information, we also propose a hierarchical problem detection framework, in which microscopic dependency information is combined with a macroscopic anomaly metric that monitors the behavior of the system as a whole. This feature is made possible by a network-based design, which provides overall information about the system without any impact on its performance.
  • Keywords
    data mining; distributed processing; performance evaluation; system monitoring; system recovery; data-mining method; distributed system; macroscopic anomaly metric; microscopic dependency information; network-based problem detection; online EM algorithm; Data mining; Distributed computing; Distributed databases; Failure analysis; Laboratories; Microscopy; Network servers; Spatial databases; Web pages; Web server
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Data Engineering, 2005. ICDE 2005. Proceedings. 21st International Conference on
  • ISSN
    1084-4627
  • Print_ISBN
    0-7695-2285-8
  • Type

    conf

  • DOI
    10.1109/ICDE.2005.93
  • Filename
    1410209
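
The abstract mentions "observed containments of transaction execution time periods" as the raw signal for dependency discovery. As a rough illustration only, the sketch below counts how often one service's transaction interval lies entirely within another's. The `Transaction` record, the service labels, and the sample log are all hypothetical, and this is plain counting: it does not implement the paper's competitive model or its online EM estimation, so it cannot by itself separate accidental from non-accidental containments.

```python
from collections import Counter
from typing import NamedTuple


class Transaction(NamedTuple):
    """Hypothetical record of one observed transaction."""
    service: str   # assumed service label, e.g. "web" or "db"
    start: float   # start time of the execution period
    end: float     # end time of the execution period


def count_containments(transactions):
    """Count, per (outer, inner) service pair, how often an inner
    transaction's time period lies entirely within an outer one's."""
    counts = Counter()
    for outer in transactions:
        for inner in transactions:
            # Skip self-comparison and pairs from the same service.
            if outer is inner or outer.service == inner.service:
                continue
            if outer.start <= inner.start and inner.end <= outer.end:
                counts[(outer.service, inner.service)] += 1
    return counts


# Made-up sample log, for illustration only.
log = [
    Transaction("web", 0.0, 10.0),
    Transaction("app", 1.0, 8.0),
    Transaction("db", 2.0, 5.0),
    Transaction("web", 20.0, 25.0),
    Transaction("db", 30.0, 31.0),
]
containments = count_containments(log)
```

In this toy log the `("web", "db")` pair is counted even though the `db` call sits inside the `app` call, which shows why raw containment counts alone do not identify direct dependencies.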