DocumentCode :
3249847
Title :
Mass log data processing and mining based on Hadoop and cloud computing
Author :
Yu, Hongyong ; Wang, Deshuai
Author_Institution :
State Key Lab. of Software Archit., Neusoft Corp., Shenyang, China
fYear :
2012
fDate :
14-17 July 2012
Firstpage :
197
Lastpage :
202
Abstract :
With the rapid development of the Internet, SaaS applications delivered as services over the Internet have become an important alternative to traditional software. While using these services, users need real-time usage information, and they also need to extract useful knowledge from it. Data processing and data mining techniques are designed to address these needs, and log data is an effective way to record SaaS usage information in a standard format. However, as the volume of data grows, traditional distributed log data processing systems are unable to process the massive log data produced by SaaS applications with millions of users. This paper proposes mass log data processing and data mining methods based on Hadoop to achieve scalability and performance. The model, process, architecture, and implementation of the data processing and mining methods are presented, and experimental results are shown and analyzed to demonstrate the effectiveness of the methods.
Keywords :
cloud computing; data mining; distributed processing; Hadoop computing; Internet; SaaS applications; distributed log data processing systems; mass log data mining; mass log data processing; Algorithm design and analysis; Data processing; Distributed databases; Real time systems; Servers; Hadoop; business intelligence; mass data processing; real time statistics
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Computer Science & Education (ICCSE), 2012 7th International Conference on
Conference_Location :
Melbourne, VIC
Print_ISBN :
978-1-4673-0241-8
Type :
conf
DOI :
10.1109/ICCSE.2012.6295056
Filename :
6295056