Title :
A fuzzy c means variant for clustering evolving data streams
Author :
Hore, Prodip ; Hall, Lawrence O. ; Goldgof, Dmitry B.
Author_Institution :
Univ. of South Florida, Tampa
Abstract :
Clustering algorithms for streaming data sets are gaining importance due to the availability of large data streams from different sources. Recently, a number of streaming algorithms have been proposed that build on crisp algorithms such as hard c means or its variants. The crisp cases may not generalize easily to fuzzy cases, because the two groups of algorithms optimize different objective functions. In this paper, we propose a streaming variant of the fuzzy c means algorithm. At any stage during processing, a good streaming algorithm should be able to summarize the data seen so far and also respond to evolving distributions. We study the tradeoff between summarizing the data seen and responding to an evolving distribution by varying the amount of history used by the streaming algorithm. An empirical evaluation of our algorithm on both artificial and real data sets under a noisy setting shows its effectiveness.
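The idea of carrying a limited amount of history through a stream can be illustrated with a minimal sketch. This is not the authors' algorithm, whose details the abstract does not give. It assumes the stream arrives in chunks, that each processed chunk is summarized by its cluster centroids weighted by their total fuzzy membership, and that the last `history` summaries are clustered together with the new chunk. The names `fcm_memberships`, `weighted_fcm`, `stream_fcm`, and `history` are hypothetical.

```python
import numpy as np

def fcm_memberships(X, V, m=2.0):
    """Standard fuzzy c-means membership update for points X and centroids V."""
    # Squared distances, shape (n, c); the epsilon avoids division by zero.
    d2 = ((X[:, None, :] - V[None, :, :]) ** 2).sum(axis=2) + 1e-12
    inv = d2 ** (-1.0 / (m - 1.0))
    return inv / inv.sum(axis=1, keepdims=True)

def weighted_fcm(X, w, c, m=2.0, iters=100, tol=1e-6, init=None, seed=0):
    """Fuzzy c-means in which point i carries weight w[i] in the centroid update."""
    rng = np.random.default_rng(seed)
    if init is None:
        V = X[rng.choice(len(X), size=c, replace=False)].astype(float)
    else:
        V = np.array(init, dtype=float)
    for _ in range(iters):
        U = fcm_memberships(X, V, m)
        wu = w[:, None] * U ** m
        V_new = (wu.T @ X) / wu.sum(axis=0)[:, None]
        converged = np.abs(V_new - V).max() < tol
        V = V_new
        if converged:
            break
    return V, fcm_memberships(X, V, m)

def stream_fcm(chunks, c, history=1, m=2.0):
    """Cluster a stream chunk by chunk, reusing the last `history` chunk summaries."""
    summaries = []  # one (centroids, weights) pair per processed chunk
    V = None
    for chunk in chunks:
        kept = summaries[-history:] if history > 0 else []
        X = np.vstack([chunk] + [s[0] for s in kept])
        w = np.concatenate([np.ones(len(chunk))] + [s[1] for s in kept])
        V, U = weighted_fcm(X, w, c, m=m, init=V)
        # Summarize only the new chunk: each centroid is weighted by the
        # total membership that the chunk's own points assign to it.
        summaries.append((V.copy(), U[:len(chunk)].sum(axis=0)))
    return V
```

In this sketch a larger `history` gives more weight to data already seen, while `history=0` clusters each chunk on its own apart from reusing the previous centroids as a starting point, which is one simple way to explore the summarization versus responsiveness tradeoff described above.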
Keywords :
fuzzy set theory; pattern clustering; crisp algorithms; data sets; data streams; fuzzy c means variant; hard c means; objective functions; Clustering algorithms; Fuzzy sets; History; Monitoring; Statistical distributions; Telephony
Conference_Title :
Systems, Man and Cybernetics, 2007. ISIC. IEEE International Conference on
Conference_Location :
Montreal, Que.
Print_ISBN :
978-1-4244-0990-7
Electronic_ISBN :
978-1-4244-0991-4
DOI :
10.1109/ICSMC.2007.4413710