DocumentCode
571800
Title
Extracting weblog of Siam University for learning user behavior on MapReduce
Author
Premchaiswadi, Wichian ; Romsaiyud, Walisa
Author_Institution
Grad. Sch. of Inf. Technol., Siam Univ., Bangkok, Thailand
Volume
1
fYear
2012
fDate
12-14 June 2012
Firstpage
149
Lastpage
154
Abstract
MapReduce is a framework that allows developers to write applications that rapidly process and analyze large volumes of data on a massively parallel scale. A clickstream is a record of a user's activity on the Internet. Clickstream analysis lets us collect, analyze, and report aggregate data about which pages visitors visit and in what order, as determined by the succession of mouse clicks each visitor makes. It can reveal usage patterns that lead to a better understanding of users' behavior. In this paper, we introduce a novel and efficient web log mining model for clustering web users. The model consists of three main steps: 1) computing the similarity measure of any path in a web page, 2) defining the k-means clustering for grouping customerIDs, and 3) generating the report on the Hadoop MapReduce framework. Our experiments were run on real-world data derived from the weblogs of Siam University in Bangkok, Thailand (www.siam.edu).
Keywords
Web services; computer aided instruction; data analysis; data mining; information retrieval; parallel programming; pattern clustering; public domain software; storage management; Hadoop; Internet; MapReduce; Web log mining model; Web page; Web user clustering; WebLog extraction; clickstream analysis; data analysis; data processing; group customerID; k-mean clustering; similarity measure computing; user behavior learning; Artificial intelligence; Computational modeling; Data mining; Educational institutions; File systems; Web pages; Web servers; Clickstream Data; Hadoop Distributed File System (HDFS); K-mean Clustering; MapReduce; Web log Analytics;
fLanguage
English
Publisher
IEEE
Conference_Titel
Intelligent and Advanced Systems (ICIAS), 2012 4th International Conference on
Conference_Location
Kuala Lumpur
Print_ISBN
978-1-4577-1968-4
Type
conf
DOI
10.1109/ICIAS.2012.6306177
Filename
6306177
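Illustrative sketch (not taken from the paper): the three steps named in the abstract can be pictured with a small in-memory Python example. Everything here is an assumption made for illustration: the sample records in CLICKS, the Jaccard measure standing in for the unspecified path similarity, the page-count vectors fed to k-means, the deterministic seeding, and the map_reduce helper, which only mimics the map and reduce phases and does not use Hadoop.

```python
from collections import defaultdict

# Hypothetical clickstream records (customer_id, page_path); not data from the paper.
CLICKS = [
    ("u1", "/admission"), ("u1", "/admission/fees"), ("u1", "/admission"),
    ("u2", "/admission"), ("u2", "/admission/fees"),
    ("u3", "/library"), ("u3", "/library/search"), ("u3", "/library"),
    ("u4", "/library"), ("u4", "/library/search"),
]


def map_reduce(records, mapper):
    """In-memory stand-in for MapReduce: map each record to (key, value), sum per key."""
    totals = defaultdict(int)
    for record in records:
        key, value = mapper(record)
        totals[key] += value
    return dict(totals)


def jaccard(pages_a, pages_b):
    """Assumed similarity measure: shared pages divided by all pages either user visited."""
    a, b = set(pages_a), set(pages_b)
    return len(a & b) / len(a | b) if a | b else 0.0


def build_vectors(counts):
    """Turn {(customer, page): hits} into one page-count vector per customer."""
    pages = sorted({page for _, page in counts})
    users = sorted({user for user, _ in counts})
    return {u: [counts.get((u, p), 0) for p in pages] for u in users}


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(vectors, k, iterations=10):
    """Plain k-means; the first k customers (in sorted order) seed the centroids."""
    users = sorted(vectors)
    centroids = [list(vectors[u]) for u in users[:k]]
    labels = {}
    for _ in range(iterations):
        # Assignment step: each customer joins the nearest centroid.
        labels = {
            u: min(range(k), key=lambda c: squared_distance(vectors[u], centroids[c]))
            for u in users
        }
        # Update step: each centroid becomes the mean of its members.
        for c in range(k):
            members = [vectors[u] for u in users if labels[u] == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Step 1 (assumed form): similarity between two customers' visited pages.
similarity_u1_u2 = jaccard(["/admission", "/admission/fees"], ["/admission", "/admission/fees"])

# Step 2: count hits per (customer, page), then cluster the customers.
counts = map_reduce(CLICKS, lambda rec: ((rec[0], rec[1]), 1))
vectors = build_vectors(counts)
labels = kmeans(vectors, k=2)

# Step 3: report the number of customers per cluster.
report = map_reduce(labels.items(), lambda item: (item[1], 1))
print(labels)
print(report)
```

On this toy input the two admission-page visitors end up in one cluster and the two library-page visitors in the other. A real deployment would run the counting and reporting as Hadoop jobs over log files in HDFS, and would use whatever similarity measure the paper actually defines.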