Title :
Similarity search for multidimensional data sequences
Author :
Lee, Seok-Lyong ; Chun, Seok-Ju ; Kim, Deok-Hwan ; Lee, Ju-Hong ; Chung, Chin-Wan
Author_Institution :
Korea Adv. Inst. of Sci. & Technol., Seoul, South Korea
Abstract :
Time series data, which are sequences of one-dimensional real numbers, have been studied in various database applications. We extend the traditional similarity search methods on time series data to support multidimensional data sequences, such as video streams. We investigate the problem of retrieving similar multidimensional data sequences from a large database. To prune irrelevant sequences in a database, we introduce correct and efficient similarity functions. Both data sequences and query sequences are partitioned into subsequences, each of which is represented by a Minimum Bounding Rectangle (MBR). Query processing is based on these MBRs instead of scanning the data elements of entire sequences. Our method is designed (1) to select candidate sequences in a database, and (2) to find the subsequences of a selected sequence whose distance to the query falls under the given threshold. The latter is of special importance when retrieving subsequences from large and complex sequences such as video: we do not need to browse the whole of the selected video stream, only the sub-streams, to find the scene we want. We have performed extensive experiments on synthetic as well as real data sequences (a collection of TV news, dramas, and documentary videos) to evaluate the proposed method. The experiments demonstrate that 73-94 percent of irrelevant sequences are pruned using the proposed method, resulting in response times 16-28 times faster than those of sequential search.
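The MBR-based pruning described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the fixed-size partitioning, the filter rule (keep a sequence if any pair of query and data MBRs lies within the threshold), and the names `partition`, `mbr`, `mbr_min_dist`, and `candidates` are assumptions made for illustration. The abstract does not specify the paper's partitioning scheme or similarity functions.

```python
from math import sqrt


def partition(seq, size):
    """Split a sequence of d-dimensional points into consecutive chunks
    of at most `size` points (assumed scheme, not taken from the paper)."""
    return [seq[i:i + size] for i in range(0, len(seq), size)]


def mbr(points):
    """Return the (low, high) corners of the minimum bounding rectangle."""
    dims = range(len(points[0]))
    low = tuple(min(p[d] for p in points) for d in dims)
    high = tuple(max(p[d] for p in points) for d in dims)
    return low, high


def mbr_min_dist(a, b):
    """Smallest Euclidean distance between any point of box a and any
    point of box b; it is 0.0 when the boxes overlap."""
    (alo, ahi), (blo, bhi) = a, b
    total = 0.0
    for d in range(len(alo)):
        # Per-dimension gap between the two intervals (0 if they overlap).
        gap = max(blo[d] - ahi[d], alo[d] - bhi[d], 0.0)
        total += gap * gap
    return sqrt(total)


def candidates(query, database, size, eps):
    """Names of sequences with at least one MBR within `eps` of some
    query MBR. Simplified filter, assumed for illustration only."""
    q_mbrs = [mbr(chunk) for chunk in partition(query, size)]
    result = []
    for name, seq in database.items():
        s_mbrs = [mbr(chunk) for chunk in partition(seq, size)]
        if any(mbr_min_dist(q, s) <= eps for q in q_mbrs for s in s_mbrs):
            result.append(name)
    return result
```

Because `mbr_min_dist` never exceeds the distance between any two points drawn from the two boxes, a sequence rejected by this filter has no point within `eps` of any query point, so the filter compares only rectangles and never scans individual data elements.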
Keywords :
multimedia databases; query processing; search problems; time series; very large databases; video databases; video signal processing; Minimum Bounding Rectangle; TV news; candidate sequences; complex sequences; data elements; data sequence retrieval; database applications; documentary videos; irrelevant sequences; large database; multidimensional data sequences; one dimensional real numbers; query processing; query sequences; response time; sequential search; similarity functions; similarity search; similarity search methods; subsequences; time series data; video stream; Databases; Design methodology; Information retrieval; Layout; Multidimensional systems; Performance evaluation; Query processing; Search methods; Streaming media; TV;
Conference_Title :
Data Engineering, 2000. Proceedings. 16th International Conference on
Conference_Location :
San Diego, CA
Print_ISBN :
0-7695-0506-6
DOI :
10.1109/ICDE.2000.839473