DocumentCode :
3322041
Title :
Similarity Search in Arbitrary Subspaces Under Lp-Norm
Author :
Lian, Xiang ; Chen, Lei
Author_Institution :
Dept. of Comput. Sci. & Eng., Hong Kong Univ. of Sci. & Technol., Kowloon
fYear :
2008
fDate :
7-12 April 2008
Firstpage :
317
Lastpage :
326
Abstract :
Similarity search has been widely used in many applications such as information retrieval, image data analysis, and time-series matching. Specifically, a similarity query retrieves all data objects in a data set that are similar to a given query object. Previous work on similarity search usually considers the search problem in the full space. In this paper, however, we propose a novel problem, subspace similarity search, which finds all data objects that match a query object in a subspace instead of the original full space. In particular, the query object can specify an arbitrary subspace with an arbitrary number of dimensions. Since traditional approaches for similarity search cannot be applied to solve the proposed problem, we introduce an efficient and effective pruning technique, which assigns scores to data objects with respect to pivots and prunes candidates via these scores. We propose an effective multipivot-based method to pre-process data objects by selecting appropriate pivots, where the entire procedure is guided by a formal cost model, such that the pruning power is maximized. Finally, the scores of each data object are organized in sorted lists to facilitate an efficient subspace similarity search. Extensive experiments have verified the correctness of our cost model and demonstrated the efficiency and effectiveness of our proposed approach for subspace similarity search.
Keywords :
query processing; search problems; arbitrary subspaces; data object retrieval; multipivot-based method; query object; search problem; similarity query; similarity search; Application software; Computer science; Costs; Data analysis; Data engineering; Image analysis; Image databases; Image retrieval; Information retrieval; Search problems
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Data Engineering, 2008. ICDE 2008. IEEE 24th International Conference on
Conference_Location :
Cancun
Print_ISBN :
978-1-4244-1836-7
Electronic_ISBN :
978-1-4244-1837-4
Type :
conf
DOI :
10.1109/ICDE.2008.4497440
Filename :
4497440