DocumentCode :
3703614
Title :
Cluster-based data oriented hashing
Author :
Sanaa Chafik;Imane Daoudi;Mounim A. El Yacoubi;Hamid El Ouardi
Author_Institution :
Telecom SudParis / Mines Telecom Institute Paris, France
fYear :
2015
Firstpage :
1
Lastpage :
7
Abstract :
Many multidimensional hashing schemes have been actively studied in recent years, providing efficient nearest neighbor search. Several hashing families can be distinguished. Learning-based hashing provides better hash function selectivity by learning the dataset distribution. The spatial hashing family proposes a partition of the multidimensional space that is better adapted to the distribution of the data points. Although multidimensional hashing techniques solve the nearest neighbor search problem efficiently, they suffer from scalability issues. In this paper, we propose a novel hashing algorithm, named Cluster Based Data Oriented Hashing, that combines spatial hashing and learning-based hashing techniques. The proposed approach first applies a clustering algorithm to structure the multidimensional space into clusters. Then, in each cluster, a learning-based hashing algorithm is applied by selecting an appropriate hash function that fits the data distribution. Experimental comparisons with standard Euclidean Locality Sensitive Hashing demonstrate the effectiveness of the proposed method for large datasets.
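The two-stage idea in the abstract (cluster first, then hash inside each cluster) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the clustering algorithm or how the per-cluster hash function is selected. The sketch assumes plain k-means for the clustering step and, for the per-cluster step, Euclidean LSH projections of the form floor((a·x + b) / w) whose bucket width w is set from the spread of that cluster's own data. The names `ClusterIndex`, `kmeans`, `nearest`, `n_hashes`, and the width heuristic are all hypothetical.

```python
import math
import random
import statistics
from collections import defaultdict


def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def dot(a, p):
    return sum(x * y for x, y in zip(a, p))


def nearest(centroids, p):
    """Index of the centroid closest to p (first one wins on ties)."""
    return min(range(len(centroids)), key=lambda i: sq_dist(centroids[i], p))


def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's k-means (assumed here; the abstract names no algorithm)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            groups[nearest(centroids, p)].append(p)
        # Keep the old centroid when a cluster ends up empty.
        centroids = [
            tuple(sum(c) / len(g) for c in zip(*g)) if g else centroids[i]
            for i, g in enumerate(groups)
        ]
    return centroids


class ClusterIndex:
    """Stage 1: cluster the space. Stage 2: one hash table per cluster."""

    def __init__(self, points, k=2, n_hashes=4, seed=0):
        rng = random.Random(seed)
        self.centroids = kmeans(points, k, seed=seed)
        members = defaultdict(list)
        for p in points:
            members[nearest(self.centroids, p)].append(p)
        dim = len(points[0])
        self.hashers, self.tables = {}, {}
        for c in range(k):
            pts = members.get(c, [])
            funcs = []
            for _ in range(n_hashes if pts else 0):
                a = [rng.gauss(0, 1) for _ in range(dim)]
                # Assumption: bucket width follows the cluster's own spread
                # along the projection; fall back to 1.0 if the spread is 0.
                w = statistics.pstdev([dot(a, p) for p in pts]) or 1.0
                funcs.append((a, rng.uniform(0, w), w))
            table = defaultdict(list)
            for p in pts:
                table[self._key(funcs, p)].append(p)
            self.hashers[c], self.tables[c] = funcs, table

    @staticmethod
    def _key(funcs, p):
        return tuple(math.floor((dot(a, p) + b) / w) for a, b, w in funcs)

    def query(self, q):
        """Closest point in q's bucket within q's cluster, or None."""
        c = nearest(self.centroids, q)
        bucket = self.tables[c].get(self._key(self.hashers[c], q), [])
        return min(bucket, key=lambda p: sq_dist(p, q)) if bucket else None
```

A query touches only one cluster's table, which is where the scalability benefit described in the abstract would come from. As with any LSH scheme, the search is approximate: a true nearest neighbor that falls in a different bucket or cluster is missed.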
Keywords :
"Clustering algorithms","Indexing","Search problems","Memory management","Lattices","Electronic mail"
Publisher :
IEEE
Conference_Titel :
Data Science and Advanced Analytics (DSAA), 2015 IEEE International Conference on
Print_ISBN :
978-1-4673-8272-4
Type :
conf
DOI :
10.1109/DSAA.2015.7344895
Filename :
7344895