DocumentCode :
3144659
Title :
Analysis farm: A cloud-based scalable aggregation and query platform for network log analysis
Author :
Wei, Jianwen ; Zhao, Yusu ; Jiang, Kaida ; Xie, Rui ; Jin, Yaohui
fYear :
2011
fDate :
12-14 Dec. 2011
Firstpage :
354
Lastpage :
359
Abstract :
Network monitoring data provides insights into the network operation status. With increasingly sophisticated ways of probing, sampling and recording network activities, the huge amount of monitoring data brings both an opportunity and a challenge for network data analysis. We aim to build a scalable platform, named Analysis Farm, for analyzing network logs. Analysis Farm's targets include fast log aggregation and agile log query. To achieve these goals, storage scalability, computation scalability and query agility must be addressed. Cloud computing and NoSQL technologies meet our needs by providing manageable on-demand hardware resources and novel data storage models. We choose OpenStack, an open-source cloud tool set, for resource provisioning, and MongoDB, an RDBMS-like document-oriented NoSQL system, for log storage and analysis. By combining the scalability of both OpenStack and MongoDB, we build Analysis Farm to be capable of storage scale-out, computation scale-out and agile query. The Analysis Farm prototype in use, consisting of 10 MongoDB servers, aggregates about 3 million log records in a 10-minute interval and handles ad hoc queries effectively in a log database that accumulates more than 400 million records per day. In this paper, we describe Analysis Farm's background, targets, architecture and some experimental results. We believe Analysis Farm will benefit those who work on big-log-style data analysis.
Keywords :
SQL; cloud computing; data analysis; data loggers; query processing; storage management; Analysis Farm prototype; MongoDB servers; NoSQL technology; OpenStack; RDBMS-like document-oriented NoSQL system; ad hoc query; agile log query; agile query; analysis farm; big-log-style data analysis; cloud computing; cloud-based scalable aggregation; computation scalability; computation scale-out; data monitoring; data storage models; fast log aggregation; log database; log records; log storage; manageable on-demand hardware resources; network data analysis; network log analysis; network logs; network monitoring data; network operation status; open-source cloud tool set; query agility; query platform; resource provisioning; storage scalability; storage scale-out; Cloud computing; Engines; Hardware; IP networks; Indexes; Scalability; Servers; Big Data; Cloud computing; Log Analysis; NoSQL;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Cloud and Service Computing (CSC), 2011 International Conference on
Conference_Location :
Hong Kong
Print_ISBN :
978-1-4577-1635-5
Electronic_ISBN :
978-1-4577-1636-2
Type :
conf
DOI :
10.1109/CSC.2011.6138547
Filename :
6138547