DocumentCode :
928869
Title :
Advanced data preprocessing for intersites Web usage mining
Author :
Tanasa, Doru ; Trousse, Brigitte
Author_Institution :
AxIS Project Team, INRIA, Sophia Antipolis, France
Volume :
19
Issue :
2
fYear :
2004
Firstpage :
59
Lastpage :
65
Abstract :
Web usage mining applies data mining procedures to analyze user access of Web sites. As with any KDD (knowledge discovery and data mining) process, WUM contains three main steps: preprocessing, knowledge extraction, and results analysis. We focus on data preprocessing, a fastidious, complex process. Analysts aim to determine the exact list of users who accessed the Web site and to reconstitute user sessions (the sequence of actions each user performed on the Web site). Intersites WUM deals with Web server logs from several Web sites, generally belonging to the same organization. Thus, analysts must reassemble the users' path through all the different Web servers that they visited. Our solution is to join all the log files and reconstitute the visit. Classical data preprocessing involves three steps: data fusion, data cleaning, and data structuration. Our solution for WUM adds what we call advanced data preprocessing. This consists of a data summarization step, which will allow the analyst to select only the information of interest. We've successfully tested our solution in an experiment with log files from INRIA Web sites.
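The three classical steps named in the abstract (data fusion, data cleaning, data structuration) can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract gives no log format, cleaning rules, or session criteria, so the `Request` layout, the static-file filter, the (IP, user agent) user key, and the 30-minute inactivity threshold are all assumptions made for illustration. The data summarization step is not covered.

```python
from dataclasses import dataclass
from itertools import groupby


# Hypothetical record layout; the abstract does not specify the log format.
@dataclass(frozen=True)
class Request:
    site: str       # Web server that logged the request
    ip: str
    agent: str      # user agent string
    timestamp: int  # seconds since some epoch
    url: str


# Assumed cleaning rule: requests for static resources are discarded.
STATIC_SUFFIXES = (".gif", ".jpg", ".png", ".css", ".js")
# Assumed inactivity threshold (a common heuristic, not taken from the abstract).
VISIT_TIMEOUT = 30 * 60


def fuse(*logs):
    """Data fusion: join the logs of several sites into one time-ordered list."""
    return sorted((r for log in logs for r in log), key=lambda r: r.timestamp)


def clean(requests):
    """Data cleaning: keep only requests that are not for static resources."""
    return [r for r in requests if not r.url.lower().endswith(STATIC_SUFFIXES)]


def structure(requests):
    """Data structuration: group requests per user, then split into visits."""
    ordered = sorted(requests, key=lambda r: (r.ip, r.agent, r.timestamp))
    visits = []
    for _, group in groupby(ordered, key=lambda r: (r.ip, r.agent)):
        current = []
        for r in group:
            # A long gap since the previous request starts a new visit.
            if current and r.timestamp - current[-1].timestamp > VISIT_TIMEOUT:
                visits.append(current)
                current = []
            current.append(r)
        visits.append(current)
    return visits


# Made-up sample data: one user moving between two sites.
site_a = [
    Request("a.example", "1.2.3.4", "UA", 0, "/index.html"),
    Request("a.example", "1.2.3.4", "UA", 5, "/logo.gif"),
]
site_b = [
    Request("b.example", "1.2.3.4", "UA", 60, "/team.html"),
    Request("b.example", "1.2.3.4", "UA", 4000, "/contact.html"),
]
visits = structure(clean(fuse(site_a, site_b)))
```

With this sample data, the first reconstructed visit contains requests from both sites, which is the intersite path the abstract refers to. The later request falls outside the assumed timeout and forms a separate visit.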
Keywords :
Internet; Web design; data mining; user interfaces; INRIA Web site; KDD knowledge discovery; WUM; Web server log; Web usage mining; data cleaning; data fusion; data mining procedure; data structuration; data summarization; intersite; knowledge extraction; user access analysis; Cleaning; Data analysis; Data mining; Data preprocessing; Displays; Performance analysis; Web page design; Web pages; Web server; Web sites;
fLanguage :
English
Journal_Title :
Intelligent Systems, IEEE
Publisher :
IEEE
ISSN :
1541-1672
Type :
jour
DOI :
10.1109/MIS.2004.1274912
Filename :
1274912