DocumentCode
2776709
Title
Improving kernel locality-sensitive hashing using pre-images and bounds
Author
Bodó, Zalán ; Csató, Lehel
Author_Institution
Dept. of Comput. Sci., Babes-Bolyai Univ., Cluj-Napoca, Romania
fYear
2012
fDate
10-15 June 2012
Firstpage
1
Lastpage
8
Abstract
Large databases are becoming increasingly common in (supervised) learning scenarios, containing hundreds of thousands or even millions of training examples. Finding the k-nearest neighbors (k-NN) of a point in a dataset, however, requires comparing the point to every training example. Locality-sensitive hashing (LSH) [11], [7], [3] hashes the dataset into buckets such that, with high probability, similar examples are grouped together, thus providing sub-linear search time for neighbors. However, linear k-NN is sometimes not enough: the Euclidean distance does not always capture important properties of the data, so kernels are used to map the data into a (possibly higher-dimensional) feature space, where the k-NN search is then performed. To kernelize the LSH from [3], the most important question is how to generate random, normally distributed vectors in the feature space. In this paper we present an improved kernel LSH technique, a modified version of the kLSH algorithm proposed in [12]. We compute the pre-images of the random feature space vectors to save significant computational resources. Our approach to pre-image calculation is interesting because it requires no additional intrinsic computations. Furthermore, for positive definite kernel functions we propose two inequalities to speed up searching.
Keywords
image classification; learning (artificial intelligence); probability; search problems; computational resource; dataset; distributed vector; improved kernel LSH technique; k-nearest neighbor; kLSH algorithm; kernel function; kernel locality-sensitive hashing; large database; linear k-NN; preimage calculation; probability; random feature space vector; sublinear search time; supervised learning; Clustering algorithms; Indexes; Kernel; Testing; Training; Vectors;
fLanguage
English
Publisher
IEEE
Conference_Titel
Neural Networks (IJCNN), The 2012 International Joint Conference on
Conference_Location
Brisbane, QLD
ISSN
2161-4393
Print_ISBN
978-1-4673-1488-6
Electronic_ISBN
2161-4393
Type
conf
DOI
10.1109/IJCNN.2012.6252742
Filename
6252742