Title :
Access Structures for Angular Similarity Queries
Author :
Apaydin, Tan ; Ferhatosmanoglu, Hakan
Author_Institution :
Dept. of Comput. Sci. & Eng., Ohio State Univ., Columbus, OH
Abstract :
Angular similarity measures have been utilized by several database applications to define semantic similarity between various data types such as text documents, time-series, images, and scientific data. Although similarity searches based on Euclidean distance have been extensively studied in the database community, the processing of angular similarity searches has remained relatively unexplored. A mismatch in the underlying geometry, together with the high dimensionality of the data, makes current techniques either inapplicable or poorly performing. This creates a need for effective indexing methods for angular similarity queries. We first discuss how to process such queries efficiently and then propose effective access structures suited to angular similarity measures. In particular, we propose two classes of access structures, namely, angular-sweep and cone-shell, which perform different types of quantization based on the angular orientation of the data objects. We also develop query processing algorithms that utilize these structures as dense indices. The proposed techniques are shown to be scalable with respect to both dimensionality and the size of the data. Our experimental results on real data sets from various applications show two to three orders of magnitude of speedup over the current techniques.
Keywords :
database indexing; query processing; angular similarity queries; angular-sweep access structure; cone-shell access structure; database applications; indexing methods; query processing algorithms; query search; semantic similarity; Computer graphics; Euclidean distance; Extraterrestrial measurements; Geometry; Image databases; Indexing; Light sources; Quantization; Query processing; Spatial databases; Angular query; angular similarity measures; high-dimensional data; indexing; performance;
Journal_Title :
Knowledge and Data Engineering, IEEE Transactions on
DOI :
10.1109/TKDE.2006.165