Title :
Computational intelligence methods for processing misaligned, unevenly sampled time series containing missing data
Author :
Cismondi, Federico ; Fialho, André S. ; Vieira, Susana M. ; Sousa, Joao M C ; Reti, Shane R. ; Howell, Michael D. ; Finkelstein, Stan N.
Author_Institution :
Eng. Syst. Div., Massachusetts Inst. of Technol., Cambridge, MA, USA
Abstract :
One consequence of the increasing amount of data stored during acquisition processes is that sampled time series are more prone to be collected in a misaligned, uneven fashion and/or to be partly lost or unavailable (missing data). Because these problems severely affect data mining techniques, this work proposes methods to (a) align misaligned, unevenly sampled data, (b) differentiate absent values caused by low sampling frequencies from those resulting from missingness mechanisms, and (c) classify recoverable and non-recoverable segments of missing data using statistical and fuzzy modeling approaches. These methods were evaluated on randomly simulated test datasets containing different amounts of missing data. Results show that: (1) using the most frequently sampled variable as a template, combined with cubic interpolation, made it possible to realign misaligned, uneven data without significant errors; (2) absent values due to low sampling frequencies can be successfully differentiated from truly missing values using 95% confidence intervals relative to the mean sampling time; (3) fuzzy modeling returned better classification results for recoverable segments, while the statistical approach performed better in classifying non-recoverable segments. The performance of all three methods decreased as the amount of missing data in the test datasets increased.
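The abstract does not spell out how the 95% confidence interval in result (2) is constructed. As a rough illustration only, the sketch below assumes a gap counts as truly missing when it exceeds the mean sampling interval plus 1.96 sample standard deviations. The function name `flag_missing_gaps`, the parameter `z`, and the threshold rule itself are hypothetical and are not taken from the paper.

```python
from statistics import mean, stdev


def flag_missing_gaps(timestamps, z=1.96):
    """Return (start, end) pairs for gaps judged too long to be normal sampling.

    Assumption (not from the paper): a gap is flagged when it exceeds
    mean(interval) + z * stdev(interval), with z = 1.96 standing in for
    the upper bound of a 95% interval around the mean sampling time.
    """
    # Time elapsed between each pair of consecutive samples.
    intervals = [b - a for a, b in zip(timestamps, timestamps[1:])]
    if len(intervals) < 2:
        # stdev needs at least two intervals; nothing can be flagged.
        return []
    threshold = mean(intervals) + z * stdev(intervals)
    return [(a, b) for a, b in zip(timestamps, timestamps[1:]) if b - a > threshold]


# Regular 1-unit sampling with one 20-unit hole between t=10 and t=30.
ts = list(range(0, 11)) + list(range(30, 41))
print(flag_missing_gaps(ts))
```

Because the long gaps themselves inflate the mean and standard deviation, a rule of this kind becomes less sensitive as more data go missing, which is at least consistent with the reported drop in performance at higher missing-data rates.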
Keywords :
data mining; fuzzy set theory; statistical analysis; time series; computational intelligence; fuzzy modeling approach; mean sampling time; missing data; statistical approach; unevenly sampled time series; Data models; Databases; Interpolation; Time frequency analysis; Time measurement; Time series analysis
Conference_Title :
Computational Intelligence and Data Mining (CIDM), 2011 IEEE Symposium on
Conference_Location :
Paris
Print_ISBN :
978-1-4244-9926-7
DOI :
10.1109/CIDM.2011.5949447