DocumentCode :
3722777
Title :
Fast Approximate near Neighbor Algorithm by Clustering in High Dimensions
Author :
Hung Tran-The;Vinh Nguyen Van;Minh Hoang Anh
Author_Institution :
Z8 FPT Software JSC, Hanoi, Vietnam
fYear :
2015
Firstpage :
274
Lastpage :
279
Abstract :
This paper addresses the (r, 1 + ϵ)-approximate near neighbor problem ( or (r, 1 + v)-NN) that is defined as follows: given a set of n points in a d-dimensional space, a query point q and parameter 0 <; δ <; 1, build a data structure that reports a point within distance (1 + ϵ)r from q with probability 1 - δ, if there is a point in the data set within distance r from q. We present an algorithm for this problem in the Hamming space by using a new clustering technique, the triangle inequality and the locality sensitive hashing approach. Our algorithm achieves O(n1+ 2ρ/1+2ρ + dn) space, and O(fdn 2ρ/1+2ρ) query time, where f is generally a small integer, ρ is a parameter of the algorithm in (STOC 1998) [1] or (STOC 2015) [2]. Our results show that we can improve the algorithms in [1], [2] when value ϵ is small (ϵ <; 1 for [1] and ϵ <; 0.5 for [2]).
Keywords :
"Clustering algorithms","Approximation algorithms","Partitioning algorithms","Approximation methods","Databases","Algorithm design and analysis","Electronic mail"
Publisher :
ieee
Conference_Titel :
Knowledge and Systems Engineering (KSE), 2015 Seventh International Conference on
Type :
conf
DOI :
10.1109/KSE.2015.72
Filename :
7371795
Link To Document :
بازگشت