Dependent Online Kernel Learning With Constant Number of Random Fourier Features

Author

Zhen Hu ; Ming Lin ; Changshui Zhang

Author_Institution

Dept. of Autom., Tsinghua Univ., Beijing, China

Volume

26

Issue

10

fYear

2015

Firstpage

2464

Lastpage

2476

Abstract

Traditional online kernel learning analysis assumes independently identically distributed (i.i.d.) about the training sequence. Recent studies reveal that when the loss function is smooth and strongly convex, given T i.i.d. training instances, a constant sampling complexity of random Fourier features is sufficient to ensure O(log T/T) convergence rate of excess risk, which is optimal in online kernel learning up to a log T factor. However, the i.i.d. hypothesis is too strong in practice, which greatly impairs their value. In this paper, we study the sampling complexity of random Fourier features in online kernel learning under non-i.i.d. assumptions. We prove that the sampling complexity under non-i.i.d. settings is also constant, but the convergence rate of excess risk is O(log T/T+ φ ), where φ is the mixing coefficient measuring the extent of non-i.i.d. of training sequence. We conduct experiments both on artificial and real large-scale data sets to verify our theories.

Keywords

Fourier analysis; computational complexity; learning (artificial intelligence); O(log T/T) convergence rate; constant sampling complexity; dependent online kernel learning analysis; i.i.d. hypothesis; log T factor; loss function; random Fourier features; training sequence; Complexity theory; Convergence; Kernel; Learning systems; Stochastic processes; Support vector machines; Training; Dependent data; PEGASOS; online learning; random Fourier features; random Fourier features.;

fLanguage

English

Journal_Title

Neural Networks and Learning Systems, IEEE Transactions on

Publisher

ieee

ISSN

2162-237X

Type

jour

DOI

10.1109/TNNLS.2014.2387313

Filename

7017528