Title :
Mining Web log data based on key path
Author :
Song, Ai-Bo ; Liang, Zuo-Peng ; Zhao, Mao-Xian ; Dong, Yi-Sheng
Author_Institution :
Dept. of Comput. Sci. & Eng., Southeast Univ., Nanjing, China
Abstract :
A Web log mining method is presented. First, the minimal key path set (MKPS) is defined and an algorithm to find the MKPS online is given. For any key path in the MKPS, the same algorithm also finds all transactions relevant to it. After scanning the transaction database only once, a relevant matrix is set up, in which the key paths in the MKPS are the columns and the transactions are the rows. Compared to previous methods, our method considers the three major features of users' Web access: ordinal, contiguous, and duplicate. Moreover, clustering transactions requires fewer dimensions than in the previous method. A clustering algorithm based on the relevant matrix obtains clustering results more precisely and quickly. Experiments show the effectiveness of the method.
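The transaction-by-key-path matrix described in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm: the abstract does not give the MKPS definition or the online search, so the key paths here are simply supplied by hand. The names `contains_contiguous` and `build_relevant_matrix`, the binary 0/1 entries, and the sample data are all assumptions made for illustration. The sketch only shows one plausible reading of "relevant", namely that a key path occurs in a transaction in order and without gaps, with repeated page visits left in place.

```python
from typing import List, Sequence, Tuple

KeyPath = Tuple[str, ...]


def contains_contiguous(transaction: Sequence[str], path: Sequence[str]) -> bool:
    """Return True if `path` occurs in `transaction` as an ordered, gap-free run.

    Assumption: this stands in for the paper's notion of a transaction being
    "relevant" to a key path, which the abstract does not define.
    """
    n, m = len(transaction), len(path)
    if m == 0:
        return False
    target = tuple(path)
    return any(tuple(transaction[i:i + m]) == target for i in range(n - m + 1))


def build_relevant_matrix(
    transactions: Sequence[Sequence[str]],
    key_paths: Sequence[KeyPath],
) -> List[List[int]]:
    """Rows are transactions, columns are key paths; entries are 0 or 1 (assumed)."""
    return [
        [1 if contains_contiguous(t, p) else 0 for p in key_paths]
        for t in transactions
    ]


# Hypothetical page-visit sequences; duplicates (B, C visited twice) are kept.
transactions = [
    ["A", "B", "C", "B", "C"],
    ["A", "C", "B"],
    ["B", "C", "D"],
]
# Hypothetical key paths, supplied by hand rather than mined.
key_paths = [("A", "B"), ("B", "C"), ("C", "D")]

matrix = build_relevant_matrix(transactions, key_paths)
for row in matrix:
    print(row)
```

Each row of `matrix` can then be passed to a clustering routine as a feature vector whose length is the number of key paths rather than the number of distinct pages, which is the dimensionality reduction the abstract refers to.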
Keywords :
Web sites; data mining; pattern clustering; tree data structures; Web log data; Web log mining; clustering algorithm; clustering transactions; contiguous features; duplicate features; minimal key path set; ordinal features; transaction database; Clustering algorithms; Computer science; Data engineering; Data mining; Educational technology; Electronic mail; Lungs; Software libraries; Transaction databases; Web pages;
Conference_Title :
Machine Learning and Cybernetics, 2002. Proceedings. 2002 International Conference on
Print_ISBN :
0-7803-7508-4
DOI :
10.1109/ICMLC.2002.1176728