DocumentCode :
651540
Title :
Compressing Locality Sensitive Hashing Tables
Author :
Santoyo, Francisco ; Chavez, E. ; Tellez, Eric S.
Author_Institution :
Div. de Estudios de Posgrado de la Fac. de Ing. Electr., Univ. Michoacana de San Nicolas de Hidalgo, Hidalgo, Mexico
fYear :
2013
fDate :
Oct. 30 2013-Nov. 1 2013
Firstpage :
41
Lastpage :
46
Abstract :
Locality Sensitive Hashing (LSH) is the industry standard for proximity searching tasks on collections of data having coordinates. An LSH index applies a set of hashing functions to the representation of an object to identify objects proximal to a query, leaving distal objects apart. In other words, objects with the same hash will be mutually proximal with high probability. LSH is very fast and gives probabilistic guarantees on the quality of the results. On the other hand, mobile applications using proximity queries are becoming commonplace. Feature extraction can be done on a smartphone. However, the actual query relies on a wireless link because memory is a scarce resource. To tackle this problem, we present a method to compress the LSH index while still being able to query it without decompressing. The query speed is practically the same, and can even be faster. We derive a lower bound on the memory requirements for the compressed representation and present an implementation using close to optimal storage. We provide an extensive experimental comparison of our compressed representation against the uncompressed one over a large database of 55 million objects. We obtained a compression ratio ranging from 70% to 80% without slowing down the search in practice.
Keywords :
cryptography; data acquisition; data compression; probability; query formulation; storage management; LSH index; close-to-optimal storage; compress representation; compression ratio; data collections; distal objects; feature extraction; hashing functions; industry standard; locality sensitive hashing tables; memory requirements; mobile applications; object representation; probabilistic guarantees; proximal objects identification; proximity queries; proximity searching tasks; query speed; smart phone; wireless link; Approximation algorithms; Electronic mail; Indexes; Measurement; Memory management; Probabilistic logic; Locality Sensitive Hashing; Succinct Proximity Search Indexes;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Computer Science (ENC), 2013 Mexican International Conference on
Conference_Location :
Morelia
ISSN :
1550-4069
Type :
conf
DOI :
10.1109/ENC.2013.12
Filename :
6679818