Title :
Similarity searching for multi-attribute sequences
Author :
Kahveci, Tamer ; Singh, Ambuj ; Gürel, Aliekber
Author_Institution :
Dept. of Comput. Sci., California Univ., Santa Barbara, CA, USA
Abstract :
We investigate the problem of searching for similar multiattribute time sequences. Such sequences arise naturally in a number of medical, financial, video, weather forecast, and stock market databases where more than one attribute is of interest at a time instant. We first solve the simple case in which the distance is defined as the Euclidean distance. Later we extend it to shift and scale invariance. We formulate a new symmetric scale and shift invariant notion of distance for such sequences. We also propose a new index structure that transforms the data sequences and clusters them according to their shiftings and scalings. This clustering improves the efficiency considerably. According to our experiments with real and synthetic datasets, the index structure's performance is 5 to 45 times better than competing techniques, the exact speedup depending on other optimizations such as caching and replication.
Keywords :
sequences; temporal databases; time series; Euclidean distance; clustering; data sequences; financial databases; index structure; medical databases; multi-attribute sequences; scalings; shift invariant notion; shiftings; similarity searching; stock market databases; symmetric scale; time sequences; video databases; weather forecast databases; Computer science; Databases; Discrete Fourier transforms; Economic forecasting; Euclidean distance; Indexes; Mathematics; Stock markets; Wavelet transforms; Weather forecasting;
Conference_Title :
Scientific and Statistical Database Management, 2002. Proceedings. 14th International Conference on
Print_ISBN :
0-7695-1632-7
DOI :
10.1109/SSDM.2002.1029718