DocumentCode
2638802
Title
High dimensional similarity joins: algorithms and performance evaluation
Author
Koudas, Nick; Sevcik, K.C.
Author_Institution
Dept. of Comput. Sci., Toronto Univ., Ont., Canada
fYear
1998
fDate
23-27 Feb 1998
Firstpage
466
Lastpage
475
Abstract
Current data repositories include a variety of data types, including audio, images and time series. State-of-the-art techniques for indexing such data and doing query processing rely on a transformation of data elements into points in a multidimensional feature space. Indexing and query processing then take place in the feature space. We study algorithms for finding relationships among points in multidimensional feature spaces, specifically algorithms for multidimensional joins. Like joins of conventional relations, correlations between multidimensional feature spaces can offer valuable information about the data sets involved. We present several algorithmic paradigms for solving the multidimensional join problem, and we discuss their features and limitations. We propose a generalization of the Size Separation Spatial Join algorithm, named Multidimensional Spatial Join (MSJ), to solve the multidimensional join problem. We evaluate MSJ along with several other specific algorithms, comparing their performance for various dimensionalities on both real and synthetic multidimensional data sets. Our experimental results indicate that MSJ, which is based on space filling curves, consistently yields good performance across a wide range of dimensionalities.
Keywords
multimedia computing; query processing; relational algebra; spatial data structures; Multidimensional Spatial Join; Size Separation Spatial Join algorithm; algorithmic paradigms; data elements; data repositories; data sets; data types; high dimensional similarity joins; multidimensional feature space; multidimensional join problem; performance evaluation; query processing; space filling curves; synthetic multidimensional data sets; time series; Computer science; Data mining; Feature extraction; Image databases; Indexing; Multidimensional systems; Multimedia databases; Query processing; Spatial databases; Visual databases
fLanguage
English
Publisher
IEEE
Conference_Titel
Data Engineering, 1998. Proceedings., 14th International Conference on
Conference_Location
Orlando, FL
ISSN
1063-6382
Print_ISBN
0-8186-8289-2
Type
conf
DOI
10.1109/ICDE.1998.655809
Filename
655809