DocumentCode :
2728376
Title :
A three-step clustering algorithm over an evolving data stream
Author :
Liu Li-xiong ; Kang Jing ; Guo Yun-fei ; Huang Hai
Author_Institution :
Nat. Digital Switching Syst. Eng. & Technol. Res. Center, Zhengzhou, China
Volume :
1
fYear :
2009
fDate :
20-22 Nov. 2009
Firstpage :
160
Lastpage :
164
Abstract :
Distinguishing data that may belong to a new cluster from outliers is a central problem in mining new patterns from evolving data streams. The problem is more pronounced because all clustering algorithms derived from the CluStream framework use distribution-based learning implemented over a sliding window. This paper proposes a three-step clustering algorithm, rDenStream, based on DenStream, which adds retrospective learning of outliers. During rDenStream clustering, dropped micro-clusters are stored temporarily in external memory; when a new cluster is discovered, these micro-clusters are re-examined to recover data that was previously discarded in error, which improves the accuracy of the new cluster. rDenStream is significant for applications that require high-accuracy clustering of evolving data. Considering the characteristics of data streams in network intrusion detection systems (NIDS), this paper models the arrival times of new-pattern data as a non-homogeneous Poisson process. Experiments on a standard data set show its advantage over other methods in the early phase of new pattern discovery.
Keywords :
Poisson distribution; data mining; pattern clustering; CluStream framework; distribution-based learning; evolving data streams; nonhomogeneous Poisson distribution; pattern discovery; rDenStream; three-step clustering algorithm; Analytical models; Clustering algorithms; Computational modeling; Data engineering; Intrusion detection; Knowledge engineering; Mathematics; Research and development; Switching systems; Systems engineering and theory; Clustering; Data mining; Evolving data streams; non-homogeneous Poisson process; retrospect learning;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Intelligent Computing and Intelligent Systems, 2009. ICIS 2009. IEEE International Conference on
Conference_Location :
Shanghai
Print_ISBN :
978-1-4244-4754-1
Electronic_ISBN :
978-1-4244-4738-1
Type :
conf
DOI :
10.1109/ICICISYS.2009.5357749
Filename :
5357749