Title :
Varanus: In Situ Monitoring for Large Scale Cloud Systems
Author :
Ward, J. Scott ; Barker, Adam
Author_Institution :
Sch. of Comput. Sci., Univ. of St. Andrews, St. Andrews, UK
Abstract :
Monitoring is an essential aspect of maintaining and developing computer systems, and its difficulty increases in proportion to the size of the system. The need for robust monitoring tools has become more evident with the advent of cloud computing. Infrastructure as a Service (IaaS) clouds allow end users to deploy vast numbers of virtual machines as part of dynamic and transient architectures. Current monitoring solutions, including many of those in the open-source domain, rely on outdated concepts such as manual configuration and centralised data collection, and adapt poorly to membership churn. In this paper we propose the development of a cloud monitoring system to provide scalable and robust lookup, data collection and analysis services for large-scale cloud systems. In lieu of centrally managed monitoring, we propose a multi-tier architecture using a layered gossip protocol to aggregate monitoring information and facilitate lookup, information collection and the identification of redundant capacity. This allows for a resource-aware data collection and storage architecture that operates over the system being monitored. This in turn enables monitoring to be done in situ, without the need for significant additional infrastructure. We evaluate this approach against alternative monitoring paradigms and demonstrate how our solution is well adapted to use in a cloud computing context.
Keywords :
cloud computing; monitoring; IaaS clouds; Varanus; aggregate monitoring information; centralised data collection; cloud computing context; cloud monitoring system; computer systems; infrastructure as a service; large scale cloud systems; layered gossip protocol; membership churn; monitoring solutions; multitier architecture; open source domain; redundant capacity; resource aware data collection; storage architecture; transient architectures; virtual machines; Bandwidth; Computer architecture; Data collection; Measurement; Protocols
Conference_Title :
Cloud Computing Technology and Science (CloudCom), 2013 IEEE 5th International Conference on
Conference_Location :
Bristol
DOI :
10.1109/CloudCom.2013.164