Title :
Stochastic Subset Selection for Learning With Kernel Machines
Author :
Rhinelander, Jason ; Liu, Xiaoping P.
Author_Institution :
Dept. of Syst. & Comput. Eng., Carleton Univ., Ottawa, ON, Canada
fDate :
6/1/2012
Abstract :
Kernel machines have gained much popularity in applications of machine learning. Support vector machines (SVMs) are a subset of kernel machines and generalize well for classification, regression, and anomaly detection tasks. The training procedure for traditional SVMs involves solving a quadratic programming (QP) problem. The QP problem scales superlinearly in computational effort with the number of training samples and is often used for offline batch processing of data. Kernel machines operate by retaining a subset of observed data during training. The data vectors contained within this subset are referred to as support vectors (SVs). The work presented in this paper introduces a subset selection method for the use of kernel machines in online, changing environments. Our algorithm uses a stochastic indexing technique to select a subset of SVs when computing the kernel expansion. The work described here is novel because it separates the selection of kernel basis functions from the training algorithm used. The subset selection algorithm presented here can be used in conjunction with any online training technique. Online kernel machines must be computationally efficient because of the real-time requirements of online environments. Our algorithm is an important contribution because it scales linearly with the number of training samples and is compatible with current training techniques. In our experiments, our algorithm outperforms standard techniques in computational efficiency and provides increased recognition accuracy. We provide results from experiments using both simulated and real-world data sets to verify our algorithm.
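The abstract does not spell out the stochastic indexing technique, so the following Python sketch only illustrates the general idea of evaluating a kernel expansion over a randomly chosen subset of SVs rather than over all of them. Everything in it is an assumption made for illustration and is not taken from the paper: uniform sampling without replacement, the Gaussian kernel, the n/m rescaling, and the names `rbf_kernel`, `full_expansion`, `stochastic_expansion`, and `subset_size`.

```python
import math
import random


def rbf_kernel(x, y, gamma=0.5):
    """Gaussian (RBF) kernel between two equal-length vectors."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)


def full_expansion(x, support_vectors, alphas, kernel=rbf_kernel):
    """Kernel expansion over every stored SV: f(x) = sum_i alpha_i * k(x_i, x)."""
    return sum(a * kernel(sv, x) for sv, a in zip(support_vectors, alphas))


def stochastic_expansion(x, support_vectors, alphas, subset_size, rng,
                         kernel=rbf_kernel):
    """Kernel expansion over a uniformly sampled subset of the SVs.

    Assumption (not from the paper): indices are drawn uniformly without
    replacement, and the partial sum is rescaled by n/m so that it
    estimates the full sum. Cost is at most `subset_size` kernel
    evaluations, independent of how many SVs are stored.
    """
    n = len(support_vectors)
    m = min(subset_size, n)
    if m == 0:
        return 0.0
    indices = rng.sample(range(n), m)  # random index subset of size m
    partial = sum(alphas[i] * kernel(support_vectors[i], x) for i in indices)
    return (n / m) * partial
```

Because the subset is chosen independently of how the coefficients are updated, a routine of this kind could in principle sit in front of any online training rule, which is the separation the abstract emphasizes. When `subset_size` is at least the number of stored SVs, the sketch reduces to the full expansion.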
Keywords :
indexing; learning (artificial intelligence); pattern classification; quadratic programming; regression analysis; security of data; set theory; support vector machines; QP problem; SVM; anomaly detection task; classification task; computational efficiency; data vector; kernel basis function; kernel expansion; kernel machines; machine learning; offline data batch processing; quadratic programming; recognition accuracy; regression task; stochastic indexing technique; stochastic subset selection; support vector machines; training algorithm; Computational complexity; Kernel; Machine learning; Noise; Support vector machines; Training; Vectors; Kernel machine; online learning; pattern recognition; support vector machine (SVM); Algorithms; Artificial Intelligence; Computer Simulation; Decision Support Techniques; Models, Statistical; Pattern Recognition, Automated; Stochastic Processes;
Journal_Title :
Systems, Man, and Cybernetics, Part B: Cybernetics, IEEE Transactions on
DOI :
10.1109/TSMCB.2011.2171680