DocumentCode :
916926
Title :
Reduced Support Vector Machines: A Statistical Theory
Author :
Lee, Yuh-Jye ; Huang, Su-Yun
Author_Institution :
Dept. of Comput. Sci. & Inf. Eng., Nat. Taiwan Univ. of Sci. & Technol., Taipei
Volume :
18
Issue :
1
fYear :
2007
Firstpage :
1
Lastpage :
13
Abstract :
In dealing with large data sets, the reduced support vector machine (RSVM) was proposed with the practical aim of overcoming some computational difficulties as well as reducing the model complexity. In this paper, we study the RSVM from the viewpoint of sampling design, its robustness, and the spectral analysis of the reduced kernel. We consider the nonlinear separating surface as a mixture of kernels. Instead of a full model, the RSVM uses a reduced mixture with kernels sampled from a certain candidate set. Our main results center on two major themes. One is the robustness of the random subset mixture model. The other is the spectral analysis of the reduced kernel. The robustness is judged by a few criteria as follows: 1) model variation measure; 2) model bias (deviation) between the reduced model and the full model; and 3) test power in distinguishing the reduced model from the full one. For the spectral analysis, we compare the eigenstructures of the full kernel matrix and the approximation kernel matrix. The approximation kernels are generated by uniform random subsets. The small discrepancies between them indicate that the approximation kernels can retain most of the relevant information for learning tasks in the full kernel. We focus on some statistical theory of the reduced set method mainly in the context of the RSVM. The use of a uniform random subset is not limited to the RSVM. This approach can act as a supplemental algorithm on top of a basic optimization algorithm, wherein the actual optimization takes place on the subset-approximated data. The statistical properties discussed in this paper are still valid
Keywords :
eigenvalues and eigenfunctions; random processes; set theory; spectral analysis; statistical analysis; support vector machines; eigenstructures; nonlinear kernel matrix; random subset mixture model; reduced support vector machines; statistical theory; Kernel; Machine learning; Monte Carlo methods; Power measurement; Robustness; Sampling methods; Support vector machine classification; Testing; Canonical angles; Monte Carlo sampling; Nyström approximation; kernel methods; maximinity; minimaxity; model complexity; reduced set; support vector machines (SVMs); uniform design; uniform random subset; Algorithms; Artificial Intelligence; Cluster Analysis; Computer Simulation; Computing Methodologies; Models, Statistical; Pattern Recognition, Automated
fLanguage :
English
Journal_Title :
Neural Networks, IEEE Transactions on
Publisher :
IEEE
ISSN :
1045-9227
Type :
jour
DOI :
10.1109/TNN.2006.883722
Filename :
4049824