DocumentCode :
1496367
Title :
Efficient Periodicity Mining in Time Series Databases Using Suffix Trees
Author :
Rasheed, Faraz ; Alshalalfa, Mohammed ; Alhajj, Reda
Author_Institution :
Dept. of Comput. Sci., Univ. of Calgary, Calgary, AB, Canada
Volume :
23
Issue :
1
fYear :
2011
Firstpage :
79
Lastpage :
94
Abstract :
Periodic pattern mining, or periodicity detection, has a number of applications, such as prediction, forecasting, and detection of unusual activities. The problem is not trivial because the data to be analyzed are mostly noisy and different periodicity types (namely symbol, sequence, and segment) are to be investigated. Accordingly, we argue that there is a need for a comprehensive approach that can analyze the whole time series or a subsection of it, effectively handle different types of noise (to a certain degree), and at the same time detect different types of periodic patterns; combining these under one umbrella is by itself a challenge. In this paper, we present an algorithm which can detect symbol, sequence (partial), and segment (full cycle) periodicity in time series. The algorithm uses a suffix tree as the underlying data structure; this allows us to design the algorithm such that its worst-case complexity is O(k·n²), where k is the maximum length of a periodic pattern and n is the length of the analyzed portion (whole or subsection) of the time series. The algorithm is noise resilient; it has been successfully demonstrated to work with replacement, insertion, deletion, or a mixture of these types of noise. We have tested the proposed algorithm on both synthetic and real data from different domains, including protein sequences. The conducted comparative study demonstrates the applicability and effectiveness of the proposed algorithm; it is generally more time-efficient and noise-resilient than existing algorithms.
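To make the notion of symbol periodicity concrete, the following is a minimal brute-force sketch in Python. It is not the suffix-tree algorithm of the paper; it only illustrates the idea of a symbol recurring every p positions from some starting position, with a confidence value that tolerates some replacement noise. The function name `symbol_periodicity`, the parameter `min_conf`, and the definition of confidence as observed occurrences divided by expected occurrences are assumptions made for illustration.

```python
from collections import defaultdict


def symbol_periodicity(series, max_period, min_conf=0.8):
    """Naive illustration (not the paper's suffix-tree method).

    For every period p in 1..max_period and every start offset st < p,
    look at positions st, st + p, st + 2p, ... and report each symbol
    whose share of those positions is at least min_conf.
    Returns a list of (symbol, period, start, confidence) tuples.
    """
    n = len(series)
    results = []
    for p in range(1, max_period + 1):
        for st in range(p):
            positions = range(st, n, p)
            expected = len(positions)
            if expected < 2:
                # A single occurrence cannot show repetition.
                continue
            counts = defaultdict(int)
            for i in positions:
                counts[series[i]] += 1
            for sym, c in counts.items():
                conf = c / expected
                if conf >= min_conf:
                    results.append((sym, p, st, conf))
    return results


# One 'c' is replaced by 'd' to mimic replacement noise.
found = symbol_periodicity("abcabcabdabc", max_period=3, min_conf=0.75)
```

In this example, `a` and `b` are reported with period 3 and confidence 1.0, while `c` is reported with period 3 and confidence 0.75 because one of its four expected occurrences was replaced. This sketch handles replacement noise only; insertion and deletion noise shift positions and would need the kind of tolerance the abstract attributes to the proposed algorithm.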
Keywords :
data mining; database management systems; pattern classification; time series; periodic pattern mining; periodicity detection; suffix trees; time series databases; Algorithm design and analysis; Computer science; Data analysis; Databases; Pattern analysis; Proteins; Testing; Time series analysis; Tree data structures; Weather forecasting; noise resilient; segment periodicity; sequence periodicity; suffix tree; symbol periodicity
fLanguage :
English
Journal_Title :
Knowledge and Data Engineering, IEEE Transactions on
Publisher :
IEEE
ISSN :
1041-4347
Type :
jour
DOI :
10.1109/TKDE.2010.76
Filename :
5467068