DocumentCode
2060911
Title
Improved SAX time series data representation based on Relative Frequency and K-Nearest Neighbor Algorithm
Author
Ahmed, Almahdi Mohammed ; Bakar, Azuraliza Abu ; Hamdan, Abdul Razak
Author_Institution
Fac. of Technol. & Inf. Sci., Univ. Kebangsaan Malaysia, Bangi, Malaysia
fYear
2010
fDate
Nov. 29 2010-Dec. 1 2010
Firstpage
1320
Lastpage
1325
Abstract
In this paper we propose a new approach based on Symbolic Aggregate approXimation (SAX), called improved SAX (iSAX), for efficient and accurate discovery of important patterns in time series data. The original SAX approach provides high-quality dimensionality reduction and allows distance measures to be defined on the symbolic representation; it is based on the Piecewise Aggregate Approximation (PAA) representation for dimensionality reduction. The proposed iSAX incorporates the Relative Frequency and K-Nearest Neighbor (RFknn) algorithm. The main task of the algorithm is to determine a sufficient number of intervals, each represented by a symbol (the alphabet size), so that the mining process is efficient and a good knowledge model is obtained without major loss of knowledge. We show that iSAX can improve the precision of the representation without losing the symbolic nature of the original SAX representation. iSAX is compared with the original SAX and PAA representations to demonstrate its quality improvement. Ten rainfall time series data sets were used. The experimental results show that iSAX gives a better representation and the minimum Euclidean distance.
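For readers unfamiliar with the baseline, the following is a minimal Python sketch of the original SAX pipeline the abstract refers to: z-normalize the series, reduce it with PAA, then map each segment mean to a symbol using equiprobable breakpoints of the standard normal distribution. It does not implement the paper's RFknn step for choosing the alphabet size, which the abstract does not specify. The function names (`znormalize`, `paa`, `sax_breakpoints`, `sax_word`) are illustrative assumptions, and the sketch assumes the series length is divisible by the number of segments.

```python
from statistics import NormalDist

import numpy as np


def znormalize(series):
    """Scale a series to zero mean and unit variance (constant series map to zeros)."""
    x = np.asarray(series, dtype=float)
    std = x.std()
    if std == 0:
        return np.zeros_like(x)
    return (x - x.mean()) / std


def paa(series, n_segments):
    """Piecewise Aggregate Approximation: the mean of each equal-length segment."""
    x = np.asarray(series, dtype=float)
    if len(x) % n_segments != 0:
        # Simplifying assumption of this sketch, not a requirement of PAA itself.
        raise ValueError("series length must be divisible by n_segments")
    return x.reshape(n_segments, -1).mean(axis=1)


def sax_breakpoints(alphabet_size):
    """Cut points splitting the standard normal into equiprobable regions."""
    nd = NormalDist()
    return np.array([nd.inv_cdf(i / alphabet_size) for i in range(1, alphabet_size)])


def sax_word(series, n_segments, alphabet_size):
    """Convert a series into a SAX word with a fixed, user-chosen alphabet size."""
    means = paa(znormalize(series), n_segments)
    # searchsorted gives, for each mean, the index of the region it falls into.
    indices = np.searchsorted(sax_breakpoints(alphabet_size), means)
    return "".join(chr(ord("a") + int(i)) for i in indices)


if __name__ == "__main__":
    print(sax_word([1, 2, 3, 4, 5, 6, 7, 8], n_segments=4, alphabet_size=4))
```

In this sketch the alphabet size is a fixed argument; according to the abstract, selecting that value is the part iSAX delegates to the RFknn algorithm.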
Keywords
approximation theory; data mining; learning (artificial intelligence); pattern clustering; time series; SAX time series data representation; high-quality dimensionality reduction; improved iSAX; k-nearest neighbor algorithm; knowledge model; minimum Euclidean distance; mining process; piecewise aggregate approximation representation; quality improvement; relative frequency; symbolic aggregate approximation; time series rainfall data sets; pre-processing and reduction
fLanguage
English
Publisher
IEEE
Conference_Titel
Intelligent Systems Design and Applications (ISDA), 2010 10th International Conference on
Conference_Location
Cairo
Print_ISBN
978-1-4244-8134-7
Type
conf
DOI
10.1109/ISDA.2010.5687092
Filename
5687092