DocumentCode :
2153710
Title :
Searching in one billion vectors: Re-rank with source coding
Author :
Jégou, Hervé ; Tavenard, Romain ; Douze, Matthijs ; Amsaleg, Laurent
fYear :
2011
fDate :
22-27 May 2011
Firstpage :
861
Lastpage :
864
Abstract :
Recent indexing techniques inspired by source coding have been shown to successfully index billions of high-dimensional vectors in memory. In this paper, we propose an approach that re-ranks the neighbor hypotheses obtained by these compressed-domain indexing methods. In contrast to the usual post-verification scheme, which performs exact distance calculation on the short list of hypotheses, the estimated distances are refined based on short quantization codes, to avoid reading the full vectors from disk. We have released a new public dataset of one billion 128-dimensional vectors and proposed an experimental setup to evaluate high-dimensional indexing algorithms on a realistic scale. Experiments show that our method accurately and efficiently re-ranks the neighbor hypotheses using little memory compared to the full vector representation.
Keywords :
indexing; source coding; compressed-domain indexing method; high dimensional indexing algorithm; post verification scheme; vector representation; Approximation algorithms; Approximation methods; Artificial neural networks; Quantization; high dimensional indexing; large databases; nearest neighbor search
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Acoustics, Speech and Signal Processing (ICASSP), 2011 IEEE International Conference on
Conference_Location :
Prague
ISSN :
1520-6149
Print_ISBN :
978-1-4577-0538-0
Electronic_ISBN :
1520-6149
Type :
conf
DOI :
10.1109/ICASSP.2011.5946540
Filename :
5946540