Title :
rDenStream, A Clustering Algorithm over an Evolving Data Stream
Author :
Liu Li-xiong ; Huang Hai ; Guo Yun-fei ; Chen Fu-Cai
Author_Institution :
Nat. Digital Switching Syst. Eng. & Technol. Res. Center, Zhengzhou, China
Abstract :
Most algorithms for mining new patterns from evolving data streams are derived from the DenStream framework, which is realized via a sliding window. As a result, in the early stage of a pattern's emergence, its knowledge points can easily be mistaken for outliers and dropped. In most cases these points can be ignored, but in special applications that need to learn the emergence rule of a pattern quickly and precisely, these points must play their role. Based on DenStream, this paper proposes a three-step clustering algorithm, rDenStream, which introduces the concept of outlier retrospect. In rDenStream, dropped micro-clusters are stored temporarily in external memory and are given a new chance to take part in clustering, which improves clustering accuracy. The experiments modeled the arrival of the data stream as a Poisson process, and the results on a standard data set showed the algorithm's advantage over other methods in the early phase of new pattern discovery.
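The outlier-retrospect idea described in the abstract can be illustrated with a minimal sketch. This is not the paper's three-step procedure, which the abstract does not spell out. It only shows the general pattern of moving faded micro-clusters to a side buffer (a plain list standing in for external memory) and checking that buffer before starting a new micro-cluster. The class names, the parameters `eps`, `min_weight` and `decay`, and the simple multiplicative fading are all assumptions made for illustration.

```python
import math
from dataclasses import dataclass, field


@dataclass(eq=False)
class MicroCluster:
    """Hypothetical micro-cluster: a center and a fading weight."""
    center: tuple
    weight: float = 1.0

    def absorb(self, point):
        # Weighted mean of the old center and the new point (point weight 1).
        total = self.weight + 1.0
        self.center = tuple(
            (c * self.weight + p) / total for c, p in zip(self.center, point)
        )
        self.weight = total


@dataclass
class RetrospectiveClusterer:
    eps: float         # assumed: max distance for a point to join a micro-cluster
    min_weight: float  # assumed: below this weight a micro-cluster is dropped
    decay: float       # assumed: multiplicative fading factor applied per tick
    live: list = field(default_factory=list)
    dropped: list = field(default_factory=list)  # stand-in for external memory

    def _nearest(self, pool, point):
        # Closest micro-cluster in the pool, or None if none is within eps.
        best = min(pool, key=lambda mc: math.dist(mc.center, point), default=None)
        if best is not None and math.dist(best.center, point) <= self.eps:
            return best
        return None

    def insert(self, point):
        mc = self._nearest(self.live, point)
        if mc is None:
            # Retrospect: give a previously dropped micro-cluster a new chance.
            mc = self._nearest(self.dropped, point)
            if mc is not None:
                self.dropped.remove(mc)
                self.live.append(mc)
        if mc is None:
            self.live.append(MicroCluster(center=tuple(point)))
        else:
            mc.absorb(point)

    def tick(self):
        # Fade every live micro-cluster, then move the weak ones to the buffer.
        for mc in self.live:
            mc.weight *= self.decay
        weak = [mc for mc in self.live if mc.weight < self.min_weight]
        self.live = [mc for mc in self.live if mc.weight >= self.min_weight]
        self.dropped.extend(weak)
```

In this sketch, a point that arrives after its micro-cluster has faded restores that micro-cluster with its remaining weight instead of starting from scratch, which is the effect the abstract attributes to outlier retrospect in the early phase of a new pattern.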
Keywords :
data mining; pattern clustering; stochastic processes; Poisson process; data stream; pattern discovery; pattern mining; rDenStream; three-step clustering algorithm; Analytical models; Clustering algorithms; Computational modeling; Data engineering; Intrusion detection; Mathematics; Partitioning algorithms; Research and development; Switching systems; Systems engineering and theory
Conference_Title :
Information Engineering and Computer Science, 2009. ICIECS 2009. International Conference on
Conference_Location :
Wuhan
Print_ISBN :
978-1-4244-4994-1
DOI :
10.1109/ICIECS.2009.5363379