Title :
A change detector for mining frequent patterns over evolving data streams
Author :
Ng, Willie ; Dash, Manoranjan
Author_Institution :
Centre for Adv. Inf. Syst., Nanyang Technol. Univ., Singapore
Abstract :
Mining data streams for frequent patterns is important in many applications. Unlike traditional static databases, the underlying process that generates a data stream evolves over time. Past data may become outdated and of little use compared with the most recent data. When a significant change occurs, the mining result can be seriously harmed if the change is not properly handled. In this paper, an online algorithm for change detection in frequent pattern mining is proposed. Although several studies have addressed adapting to changes, we contend that adaptation alone is not enough: the ability to detect and characterize change is essential in many applications. A novel test strategy is designed to gather "evidence" sufficient to conclude whether the current sample differs significantly from a reference sample. Different statistical tests are evaluated, and our study shows that the chi-square test is the most suitable for enumerated or count data.
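The abstract does not give the details of the proposed test strategy, so the following is only a minimal sketch of the general idea it names: a chi-square test of homogeneity comparing pattern counts in a reference sample against a current sample. The function names, the example counts, and the choice to treat each pattern as a separate category are assumptions made for illustration, not the authors' algorithm. The critical value is supplied by the caller (for example 3.841 for one degree of freedom at the 0.05 level).

```python
from typing import Sequence, Tuple


def chi_square_homogeneity(reference: Sequence[int],
                           current: Sequence[int]) -> Tuple[float, int]:
    """Return (statistic, degrees of freedom) for a 2 x k table of counts.

    reference[i] and current[i] are the counts of the same (hypothetical)
    pattern i in the reference sample and in the current sample.
    """
    if len(reference) != len(current):
        raise ValueError("both samples must count the same patterns")
    n_ref, n_cur = sum(reference), sum(current)
    if n_ref == 0 or n_cur == 0:
        raise ValueError("each sample needs at least one observation")
    total = n_ref + n_cur
    stat = 0.0
    used = 0
    for r, c in zip(reference, current):
        col = r + c
        if col == 0:
            continue  # pattern absent from both samples: no information
        used += 1
        # Expected counts under the hypothesis of no change.
        exp_r = n_ref * col / total
        exp_c = n_cur * col / total
        stat += (r - exp_r) ** 2 / exp_r + (c - exp_c) ** 2 / exp_c
    return stat, max(used - 1, 0)


def changed(reference: Sequence[int], current: Sequence[int],
            critical_value: float) -> bool:
    """Flag a change when the statistic exceeds the given critical value."""
    stat, _ = chi_square_homogeneity(reference, current)
    return stat > critical_value


if __name__ == "__main__":
    # Illustrative counts only; not data from the paper.
    print(chi_square_homogeneity([90, 10], [10, 90]))
    print(changed([90, 10], [10, 90], critical_value=3.841))
```

Passing the critical value in explicitly keeps the sketch free of third-party dependencies; a statistics library could instead be used to turn the statistic into a p-value.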
Keywords :
data mining; statistical testing; change detector; chi-square test; data streams; frequent pattern mining; statistical tests; Application software; Change detection algorithms; Databases; Detectors; Feeds; Information systems; Itemsets; Sampling methods; Testing; Change Detection
Conference_Title :
Systems, Man and Cybernetics, 2008. SMC 2008. IEEE International Conference on
Conference_Location :
Singapore
Print_ISBN :
978-1-4244-2383-5
Electronic_ISBN :
1062-922X
DOI :
10.1109/ICSMC.2008.4811655