DocumentCode :
2184136
Title :
DSM-TKP: mining top-k path traversal patterns over Web click-streams
Author :
Li, Hua-Fu ; Lee, Suh-Yin ; Shan, Man-Kwan
Author_Institution :
Dept. of Comput. Sci. & Inf. Eng., National Chiao-Tung Univ., Hsinchu, Taiwan
fYear :
2005
fDate :
19-22 Sept. 2005
Firstpage :
326
Lastpage :
329
Abstract :
Online, single-pass mining of Web click-streams poses some interesting computational issues, such as the unbounded length of the streaming data, a possibly very fast arrival rate, and the restriction to just one scan over previously arrived click-sequences. In this paper, we propose a new single-pass algorithm, called DSM-TKP (data stream mining for top-k path traversal patterns), for mining top-k path traversal patterns, where k is the desired number of path traversal patterns to be mined. An effective summary data structure called TKP-forest (top-k path forest) is used to maintain the essential information about the top-k path traversal patterns of the click-stream so far. Experimental studies show that the DSM-TKP algorithm has stable memory usage and makes only one pass over the streaming data.
Keywords :
Internet; data mining; data structures; tree searching; Web click-stream; data stream mining; memory usage; single-pass algorithm; summary data structure; top-k path forest; top-k path traversal pattern; Computer science; Data engineering; Data mining; Data models; Data structures; Databases; Itemsets; Measurement; Monitoring; Telecommunication traffic;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Web Intelligence, 2005. Proceedings. The 2005 IEEE/WIC/ACM International Conference on
Print_ISBN :
0-7695-2415-X
Type :
conf
DOI :
10.1109/WI.2005.56
Filename :
1517866