Title :
Fast Algorithm for Approximate k-Nearest Neighbor Graph Construction
Author :
Dilin Wang ; Lei Shi ; Jianwen Cao
Abstract :
k-nearest neighbor (k-NN) graphs are widely used in data mining and machine learning. Constructing a high-quality k-NN graph efficiently for generic similarity measures is crucial for many applications. In this paper, we propose a new approach to construct an approximate k-NN graph effectively and efficiently. Our framework is as follows: (1) generate a random approximation of the k-NN graph, Gf; (2) perform random hierarchical partitions of the space to construct an approximate neighborhood graph Gp, which is then combined with Gf to yield a more accurate graph Gm; (3) conduct neighborhood propagation on Gm to further improve its accuracy, and output the result as the new Gf; (4) repeat steps (2) and (3) several times until a reasonable solution is reached. Experiments on a variety of real data sets and on a synthetic data set with high intrinsic dimensionality verify the performance of the proposed method and show that it outperforms previous state-of-the-art k-NN graph construction approaches.
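The four steps in the abstract can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the abstract does not say how the space is partitioned, which similarity measure is used, or when iteration stops. The sketch therefore assumes squared Euclidean distance, a split around two randomly chosen pivot points, and a fixed number of rounds. All function names and the `leaf_size`, `rounds`, and `seed` parameters are hypothetical.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance (an assumed similarity measure)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def top_k(points, i, candidates, k):
    """Keep the k distinct candidates closest to point i, excluding i."""
    unique = set(candidates) - {i}
    return sorted(unique, key=lambda j: (sq_dist(points[i], points[j]), j))[:k]


def random_graph(points, k, rng):
    """Step 1: Gf, where each point gets k random neighbors."""
    n = len(points)
    graph = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        graph.append(top_k(points, i, rng.sample(others, k), k))
    return graph


def partition_graph(points, k, leaf_size, rng):
    """Step 2a: Gp from one random hierarchical partition.

    Assumption: each split sends a point to the nearer of two random
    pivots; small leaves are solved by brute force.
    """
    graph = [[] for _ in points]

    def split(ids):
        if len(ids) <= leaf_size:
            for i in ids:
                graph[i] = top_k(points, i, ids, k)
            return
        a, b = rng.sample(ids, 2)
        left, right = [], []
        for i in ids:
            nearer_a = sq_dist(points[i], points[a]) <= sq_dist(points[i], points[b])
            (left if nearer_a else right).append(i)
        if not right:  # degenerate split (coincident pivots): halve instead
            mid = len(ids) // 2
            left, right = ids[:mid], ids[mid:]
        split(left)
        split(right)

    split(list(range(len(points))))
    return graph


def merge(points, g1, g2, k):
    """Step 2b: Gm keeps the best k neighbors from both graphs."""
    return [top_k(points, i, g1[i] + g2[i], k) for i in range(len(points))]


def propagate(points, graph, k):
    """Step 3: neighbors of neighbors become candidate neighbors."""
    result = []
    for i, nbrs in enumerate(graph):
        candidates = list(nbrs)
        for j in nbrs:
            candidates.extend(graph[j])
        result.append(top_k(points, i, candidates, k))
    return result


def build_knn_graph(points, k, rounds=5, leaf_size=20, seed=0):
    """Steps 1-4, with a fixed round count standing in for the stop rule."""
    rng = random.Random(seed)
    gf = random_graph(points, k, rng)
    for _ in range(rounds):
        gp = partition_graph(points, k, leaf_size, rng)
        gm = merge(points, gf, gp, k)
        gf = propagate(points, gm, k)
    return gf
```

In this sketch each neighbor list is always rebuilt from a superset of the previous list, so the quality of Gf cannot get worse from one round to the next. Calling `build_knn_graph(points, k=5)` on a list of coordinate tuples returns, for each point, the indices of its approximate nearest neighbors sorted by distance.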
Keywords :
data analysis; data mining; graph theory; learning (artificial intelligence); approximate k-nearest neighbor graph construction; generic similarity; high intrinsic dimensional synthetic data set; machine learning; neighborhood propagation; random hierarchical partitions; random k-NN graph approximation; real data sets; Accuracy; Algorithm design and analysis; Approximation algorithms; Approximation methods; Complexity theory; Nearest neighbor searches; Partitioning algorithms; approximate algorithm; k-nearest neighbor graph; multiple random division;
Conference_Title :
Data Mining Workshops (ICDMW), 2013 IEEE 13th International Conference on
Conference_Location :
Dallas, TX
Print_ISBN :
978-1-4799-3143-9
DOI :
10.1109/ICDMW.2013.50