Title :
SSM : A Frequent Sequential Data Stream Patterns Miner
Author :
Ezeife, C.I. ; Monwar, Mostafa
Author_Institution :
Sch. of Comput. Sci., Windsor Univ., Ont.
fDate :
March 1 2007-April 5 2007
Abstract :
Data stream applications, such as sensor networks and click streams, have data arriving continuously at high rates and require an online mining process capable of delivering current and near-accurate results on demand without full access to all historical stored data. Frequent sequential mining is the process of discovering frequent sequential patterns in data sequences, as found in applications like Web log access sequences. Mining frequent sequential patterns in data stream applications must contend with many challenges, such as limited memory for unbounded data, the inability of algorithms to scan the infinitely flowing original dataset more than once, and the need to deliver current and accurate results on demand. Existing work on mining frequent patterns in data streams is mostly for non-sequential patterns. This paper proposes the SSM-algorithm (sequential stream mining-algorithm), which uses three types of data structures (D-List, PLWAP tree and FSP-tree) to handle the complexities of mining frequent sequential patterns in data streams. It summarizes frequency counts of items with the D-List, continuously builds a PLWAP tree and mines frequent sequential patterns from batches of stream records, and maintains the mined frequent sequential patterns incrementally with the FSP-tree. The proposed algorithm can be deployed to analyze e-commerce data where the primary source of data is click stream data.
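The batch-wise flow described in the abstract can be illustrated with a minimal sketch. It is not the SSM-algorithm itself: the abstract does not specify how the D-List, PLWAP tree, or FSP-tree are built, so the class name `StreamSummary`, the `min_support` parameter, and the brute-force pair counting that stands in for PLWAP-tree mining are all assumptions made for illustration. Two plain counters play the roles the abstract assigns to the D-List (item frequency counts) and the FSP-tree (incrementally maintained patterns), and patterns are limited to ordered pairs.

```python
from collections import Counter
from itertools import combinations


class StreamSummary:
    """Hypothetical stand-in for the D-List and FSP-tree roles; not the paper's structures."""

    def __init__(self, min_support):
        self.min_support = min_support    # assumed: fraction of sequences, in (0, 1]
        self.seen = 0                     # number of sequences processed so far
        self.item_counts = Counter()      # D-List-like role: per-item support counts
        self.pattern_counts = Counter()   # FSP-tree-like role: per-pattern support counts

    def process_batch(self, batch):
        """Consume one batch of sequences; each batch is scanned only once."""
        self.seen += len(batch)
        # Count each item once per sequence (support, not raw occurrences).
        for seq in batch:
            self.item_counts.update(set(seq))
        threshold = self.min_support * self.seen
        frequent_items = {i for i, c in self.item_counts.items() if c >= threshold}
        # Brute-force stand-in for the PLWAP-tree mining step: count ordered
        # pairs that occur as subsequences, restricted to frequent items.
        for seq in batch:
            kept = [i for i in seq if i in frequent_items]
            pairs = {(kept[a], kept[b]) for a, b in combinations(range(len(kept)), 2)}
            self.pattern_counts.update(pairs)

    def frequent_patterns(self):
        """Return the pairs whose support meets the threshold over all data seen."""
        threshold = self.min_support * self.seen
        return {p: c for p, c in self.pattern_counts.items() if c >= threshold}


summary = StreamSummary(min_support=0.5)
summary.process_batch([["a", "b", "c"], ["a", "c"]])
summary.process_batch([["a", "b"], ["b", "c", "d"]])
print(summary.frequent_patterns())
```

Because earlier batches are never rescanned, a pair involving an item that only becomes frequent later is undercounted, so the result is approximate; this reflects the one-pass constraint the abstract mentions rather than any accuracy property of SSM.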
Keywords :
data mining; tree data structures; D-List data structures; FSP-tree data structures; PLWAP tree data structures; SSM-Algorithm; Web log access sequences; Web sequential mining; customer access sequence; data sequences; data stream applications; frequent sequential data stream pattern miner; frequent sequential pattern discovery; online mining; sequential stream mining-algorithm; Application software; Computational intelligence; Computer displays; Computer science; Content addressable storage; Data mining; Data structures; Frequency; Intelligent sensors; TV; Click Stream Data; Customer Access Sequence; Frequent Sequential patterns; Stream Mining; Web Sequential Mining;
Conference_Title :
Computational Intelligence and Data Mining, 2007. CIDM 2007. IEEE Symposium on
Conference_Location :
Honolulu, HI
Print_ISBN :
1-4244-0705-2
DOI :
10.1109/CIDM.2007.368862