DocumentCode :
3124984
Title :
Time Series Epenthesis: Clustering Time Series Streams Requires Ignoring Some Data
Author :
Rakthanmanon, Thanawin ; Keogh, Eamonn J. ; Lonardi, Stefano ; Evans, Scott
Author_Institution :
Dept. of Comput. Sci. & Eng., Univ. of California, Riverside, CA, USA
fYear :
2011
fDate :
11-14 Dec. 2011
Firstpage :
547
Lastpage :
556
Abstract :
Given the pervasiveness of time series data in all human endeavors, and the ubiquity of clustering as a data mining application, it is somewhat surprising that the problem of time series clustering from a single stream remains largely unsolved. Most work on time series clustering considers the clustering of individual time series, e.g., gene expression profiles, individual heartbeats or individual gait cycles. The few attempts at clustering time series streams have been shown to be objectively incorrect in some cases, and in other cases shown to work only on the most contrived datasets by carefully adjusting a large set of parameters. In this work, we make two fundamental contributions. First, we show that the problem definition for time series clustering from streams currently used is inherently flawed, and a new definition is necessary. Second, we show that the Minimum Description Length (MDL) framework offers an efficient, effective and essentially parameter-free method for time series clustering. We show that our method produces objectively correct results on a wide variety of datasets from medicine, zoology and industrial process analyses.
Keywords :
data mining; pattern clustering; time series; data mining application; gene expression profiles; human endeavors; industrial process analyses; medicine; minimum description length framework; parameter free method; time series epenthesis; time series streams clustering; zoology; Clustering algorithms; Data mining; Encoding; Entropy; Euclidean distance; Handicapped aids; Time series analysis; MDL; clustering; time series
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Data Mining (ICDM), 2011 IEEE 11th International Conference on
Conference_Location :
Vancouver, BC
ISSN :
1550-4786
Print_ISBN :
978-1-4577-2075-8
Type :
conf
DOI :
10.1109/ICDM.2011.146
Filename :
6137259