• DocumentCode
    1670170
  • Title

    An agent-based distributed monitoring framework (Extended abstract)

  • Author

    Yanhaona, Muhammad N. ; Prodhan, Anindya T. ; Grimshaw, Andrew S.

  • Author_Institution
    Univ. of Virginia, Charlottesville, VA, USA
  • fYear
    2015
  • Firstpage
    1
  • Lastpage
    10
  • Abstract
    In compute clusters, monitoring of infrastructure and application components is essential for performance assessment, failure detection, problem forecasting, better resource allocation, and several other purposes. Present-day trends towards larger and more heterogeneous clusters, the rise of virtual data-centers, and greater variability of usage suggest that we have to rethink how we do monitoring. We need solutions that will remain scalable in the face of unforeseen expansions, can work in a wide range of environments, and can adapt to changing requirements. We have developed an agent-based framework for constructing such monitoring solutions. Our framework deals with all scalability and flexibility issues associated with monitoring and leaves only the use-case specific task of data generation to the specific solution. This separation of concerns provides a versatile design that enables a single monitoring solution to work in a range of environments and, at the same time, enables a range of monitoring solutions exhibiting different behaviors to be constructed by varying the tunable parameters of the framework. This paper presents the design, implementation, and evaluation of our novel framework.
  • Keywords
    computer centres; distributed processing; multi-agent systems; pattern clustering; system monitoring; agent-based distributed monitoring framework; application components; data generation; failure detection; heterogeneous clusters; infrastructure monitoring; performance assessment; problem forecasting; resource allocation; virtual data-centers; Fault tolerance; Heart beat; Monitoring; Quality of service; Receivers; Routing; Scalability; autonomous systems; cluster monitoring; distributed systems; flexibility; scalability
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Networking Systems and Security (NSysS), 2015 International Conference on
  • Conference_Location
    Dhaka
  • Print_ISBN
    978-1-4799-8125-0
  • Type

    conf

  • DOI
    10.1109/NSysS.2015.7043515
  • Filename
    7043515