Title :
Tracking High Quality Clusters over Uncertain Data Streams
Author :
Zhang, Chen ; Gao, Ming ; Zhou, Aoying
Author_Institution :
Dept. of Comput. Sci. & Eng., Fudan Univ., Shanghai
Date :
March 29 2009-April 2 2009
Abstract :
Recently, data mining over uncertain data streams has attracted considerable attention because imprecise data are generated widely by a variety of streaming applications. In this paper, we address the problem of clustering over uncertain data streams. When tuples are uncertain and follow different probability distributions, the clustering algorithm should consider not only the tuple value but also its uncertainty. To serve both purposes, a metric named tuple uncertainty is integrated into the overall clustering procedure. First, we survey the uncertain data model and propose our uncertainty measurement and its corresponding properties. Second, based on this uncertainty quantification method, we provide a two-phase stream clustering algorithm and elaborate on its implementation details. Finally, performance experiments over a number of real and synthetic data sets demonstrate the effectiveness and efficiency of our method.
Keywords :
data mining; pattern clustering; statistical distributions; tracking; clustering algorithm; probability distribution; quantification method; tuple uncertainty; uncertain data stream; Application software; Clustering algorithms; Computer science; Data engineering; Data mining; Laboratories; Pervasive computing; Quality of service; Software engineering; Uncertainty; Clustering; Data Stream; Uncertainty Data
Conference_Title :
Data Engineering, 2009. ICDE '09. IEEE 25th International Conference on
Conference_Location :
Shanghai
Print_ISBN :
978-1-4244-3422-0
Electronic_ISBN :
1084-4627
DOI :
10.1109/ICDE.2009.160