• DocumentCode
    2184136
  • Title

    DSM-TKP: mining top-k path traversal patterns over Web click-streams

  • Author

    Li, Hua-Fu ; Lee, Suh-Yin ; Shan, Man-Kwan

  • Author_Institution
    Dept. of Comput. Sci. & Inf. Eng., National Chiao-Tung Univ., Hsinchu, Taiwan
  • fYear
    2005
  • fDate
    19-22 Sept. 2005
  • Firstpage
    326
  • Lastpage
    329
  • Abstract
    Online, single-pass mining of Web click-streams poses some interesting computational issues, such as the unbounded length of streaming data, a possibly very fast arrival rate, and the restriction to just one scan over previously arrived click-sequences. In this paper, we propose a new single-pass algorithm, called DSM-TKP (data stream mining for top-k path traversal patterns), for mining top-k path traversal patterns, where k is the desired number of path traversal patterns to be mined. An effective summary data structure called TKP-forest (top-k path forest) is used to maintain the essential information about the top-k path traversal patterns of the click-stream so far. Experimental studies show that the DSM-TKP algorithm keeps memory usage stable and makes only one pass over the streaming data.
  • Keywords
    Internet; data mining; data structures; tree searching; Web click-stream; data stream mining; memory usage; single-pass algorithm; summary data structure; top-k path forest; top-k path traversal pattern; Computer science; Data engineering; Data mining; Data models; Data structures; Databases; Itemsets; Measurement; Monitoring; Telecommunication traffic
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Web Intelligence, 2005. Proceedings. The 2005 IEEE/WIC/ACM International Conference on
  • Print_ISBN
    0-7695-2415-X
  • Type

    conf

  • DOI
    10.1109/WI.2005.56
  • Filename
    1517866