DocumentCode
2728376
Title
A three-step clustering algorithm over an evolving data stream
Author
Liu Li-xiong ; Kang Jing ; Guo Yun-fei ; Huang Hai
Author_Institution
Nat. Digital Switching Syst. Eng. & Technol. Res. Center, Zhengzhou, China
Volume
1
fYear
2009
fDate
20-22 Nov. 2009
Firstpage
160
Lastpage
164
Abstract
Distinguishing data that may belong to a new cluster from outliers is a central problem in mining new patterns from evolving data streams. The problem is aggravated because all clustering algorithms derived from the CluStream framework use distribution-based learning realized via a sliding window. This paper proposes a three-step clustering algorithm, rDenStream, based on DenStream, which adds outlier retrospect learning. During rDenStream clustering, dropped micro-clusters are temporarily stored in external memory; when a new cluster is discovered, these micro-clusters are learned retrospectively to recover data that was formerly discarded in error, which improves the accuracy of the new cluster. rDenStream is significant for applications that require high-accuracy clustering of evolving data. Considering the characteristics of data streams in network intrusion detection systems (NIDS), this paper models the arrival times of new-pattern data as a non-homogeneous Poisson process. Experiments on a standard data set show its advantage over other methods in the early phase of new pattern discovery.
Keywords
Poisson distribution; data mining; pattern clustering; CluStream framework; distribution-based learning; evolving data streams; nonhomogeneous Poisson distribution; pattern discovery; rDenStream; three-step clustering algorithm; Analytical models; Clustering algorithms; Computational modeling; Data engineering; Intrusion detection; Knowledge engineering; Mathematics; Research and development; Switching systems; Systems engineering and theory; Clustering; Data mining; Evolving data streams; non-homogeneous Poisson process; retrospect learning;
fLanguage
English
Publisher
IEEE
Conference_Titel
Intelligent Computing and Intelligent Systems, 2009. ICIS 2009. IEEE International Conference on
Conference_Location
Shanghai
Print_ISBN
978-1-4244-4754-1
Electronic_ISBN
978-1-4244-4738-1
Type
conf
DOI
10.1109/ICICISYS.2009.5357749
Filename
5357749