Title :
Research of Massive Web Log Data Mining Based on Cloud Computing
Author :
Zhen Qi Wang ; Hai Long Li
Author_Institution :
Inf. & Network Manage. Center, North China Electr. Power Univ., Baoding, China
Abstract :
Internet data is massive, heterogeneous, dynamic, and increasingly complex. Data mining and analysis can extract potentially valuable information from it, but traditional data mining systems are bottlenecked by data storage and computing power. To address this problem, we use cloud computing technology to design a massive Web log data mining and analysis platform built on the Hadoop cloud framework. To improve mining efficiency, we also parallelize the Apriori algorithm for massive Web log mining. We then use the platform to verify the efficiency of the parallelized Apriori algorithm.
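The abstract does not give the details of the parallelization, so the following is only a minimal single-machine sketch of how Apriori support counting can be split into map and reduce steps. It is not the authors' implementation and does not use Hadoop. The function names (`map_count`, `reduce_counts`, `generate_candidates`, `parallel_apriori`) and the idea of treating each Web log session as a list of visited pages are assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations


def map_count(partition, candidates):
    """Map step (hypothetical): count candidate itemsets within one data partition."""
    counts = Counter()
    for transaction in partition:
        items = set(transaction)
        for candidate in candidates:
            if items.issuperset(candidate):
                counts[candidate] += 1
    return counts


def reduce_counts(partials, min_support):
    """Reduce step (hypothetical): sum partial counts, keep itemsets meeting min_support."""
    total = Counter()
    for partial in partials:
        total.update(partial)
    return {itemset: n for itemset, n in total.items() if n >= min_support}


def generate_candidates(prev_frequent, k):
    """Apriori pruning: keep a k-itemset only if all its (k-1)-subsets are frequent."""
    items = sorted({item for itemset in prev_frequent for item in itemset})
    return {
        candidate
        for candidate in combinations(items, k)
        if all(sub in prev_frequent for sub in combinations(candidate, k - 1))
    }


def parallel_apriori(partitions, min_support):
    """Level-wise Apriori; on a cluster, each map_count call would run on its own node."""
    candidates = {(item,) for part in partitions for t in part for item in t}
    frequent = {}
    k = 1
    while candidates:
        # Sequential here; assumed to be the parallel part in a MapReduce job.
        partials = [map_count(part, candidates) for part in partitions]
        level = reduce_counts(partials, min_support)
        frequent.update(level)
        k += 1
        candidates = generate_candidates(set(level), k)
    return frequent
```

In this sketch each partition stands in for one input split of the log data, and itemsets are represented as sorted tuples of page paths so they can be used as dictionary keys.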
Keywords :
cloud computing; data mining; Apriori algorithm; Internet data; cloud Hadoop framework; computing power; data analysis; data mining system; data storage; massive Web log data mining; Algorithm design and analysis; Association rules; Clustering algorithms; Data preprocessing; Distributed databases; Association rule mining; Map/Reduce; Web log mining
Conference_Title :
Computational and Information Sciences (ICCIS), 2013 Fifth International Conference on
Conference_Location :
Shiyang
DOI :
10.1109/ICCIS.2013.162