Title :
Experimental evaluation of selected tree structures for exact and approximate k-nearest neighbor classification
Author :
Cislak, Aleksander ; Grabowski, Szymon
Author_Institution :
Dept. of Inf., Tech. Univ. of Munich, Garching, Germany
Abstract :
Spatial data structures, for vector or metric spaces, are a well-known means to speed-up proximity queries. One of the common uses of the found neighbors of the query object is in classification methods, e.g., the famous k-nearest neighbor algorithm. Still, most experimental works focus on providing attractive tradeoffs between neighbor search times and the neighborhood quality, but they ignore the impact of such tradeoffs on the classification accuracy. In this paper, we explore a few simple approximate and probabilistic variants of two popular spatial data structures, the k-d tree and the ball tree, with k-NN results on real data sets. The main difference between these two structures is the location of input data - in all nodes (k-d tree), or in the leaves (ball tree) - and for this reason they act as good representatives of other spatial structures. We show that in several cases significant speedups compared to the use of such structures in the exact k-NN classification are possible, with a moderate penalty in accuracy. We conclude that the usage of the k-d tree is a more promising approach.
Keywords :
pattern classification; tree data structures; ball tree; k-NN classification; k-d tree; k-nearest neighbor classification; spatial data structures; tree structures;
Conference_Titel :
Computer Science and Information Systems (FedCSIS), 2014 Federated Conference on
Conference_Location :
Warsaw