DocumentCode :
2541309
Title :
A fuzzy c means variant for clustering evolving data streams
Author :
Hore, Prodip ; Hall, Lawrence O. ; Goldgof, Dmitry B.
Author_Institution :
Univ. of South Florida, Tampa
fYear :
2007
fDate :
7-10 Oct. 2007
Firstpage :
360
Lastpage :
365
Abstract :
Clustering algorithms for streaming data sets are gaining importance due to the availability of large data streams from different sources. Recently, a number of streaming algorithms have been proposed using crisp algorithms such as hard c means or its variants. The crisp cases may not be easily generalized to fuzzy cases, as these two groups of algorithms try to optimize different objective functions. In this paper, we propose a streaming variant of the fuzzy c means algorithm. At any stage during processing, a good streaming algorithm should be able to summarize the data seen so far and also respond to evolving distributions. We study the tradeoff between summarization of the data seen and response to an evolving distribution by varying the amount of history used by a streaming algorithm. Empirical evaluation of the performance of our algorithm using both artificial and real data sets under a noisy setting shows its effectiveness.
Keywords :
fuzzy set theory; pattern clustering; crisp algorithms; data sets; data streams; fuzzy c means variant; hard c means; objective functions; Clustering algorithms; Fuzzy sets; History; Monitoring; Statistical distributions; Telephony;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Systems, Man and Cybernetics, 2007. ISIC. IEEE International Conference on
Conference_Location :
Montreal, Que.
Print_ISBN :
978-1-4244-0990-7
Electronic_ISBN :
978-1-4244-0991-4
Type :
conf
DOI :
10.1109/ICSMC.2007.4413710
Filename :
4413710