Title :
Scalable Multicore k-NN Search via Subspace Clustering for Filtering
Author :
Xiaoxin Tang ; Zhiyi Huang ; Eyers, David ; Mills, Steven ; Minyi Guo
Author_Institution :
Dept. of Comput. Sci. & Eng., Shanghai Jiao Tong Univ., Shanghai, China
Abstract :
k Nearest Neighbors (k-NN) search is a widely used category of algorithms with applications in domains such as computer vision and machine learning. Despite the desire to process increasing amounts of high-dimensional data within these domains, k-NN algorithms scale poorly on multicore systems because they hit a memory wall. In this paper, we propose a novel data filtering strategy for k-NN search algorithms on multicore platforms. By excluding unlikely features during the k-NN search process, this strategy can reduce the amount of computation required as well as the memory footprint. It is complementary to the data selection strategies used in other state-of-the-art k-NN algorithms. A Subspace Clustering for Filtering (SCF) method is proposed to implement the data filtering strategy. Experimental results on four k-NN algorithms show that SCF can significantly improve their performance on three modern multicore platforms with only a small loss of search precision.
Keywords :
feature selection; learning (artificial intelligence); multiprocessing systems; pattern classification; pattern clustering; search problems; SCF method; computer vision; data filtering strategy; data selection strategies; high-dimensional data; k nearest neighbors search; k-NN search algorithms; k-NN search process; machine learning; memory footprint; memory wall; multicore platforms; multicore systems; scalable multicore k-NN search; subspace clustering for filtering method; Approximation algorithms; Clustering algorithms; Estimation; Filtering; Multicore processing; Nearest neighbor searches; Scalability; High-Dimensional Space; Memory Wall; Multicore Systems; Scalability; Subspace Clustering for Filtering; high-dimensional space; k Nearest Neighbors; k Nearest neighbors; memory wall; multicore systems; scalability; subspace clustering for filtering;
Journal_Title :
Parallel and Distributed Systems, IEEE Transactions on
DOI :
10.1109/TPDS.2014.2372755