DocumentCode :
1286110
Title :
Local Minimax Learning of Functions With Best Finite Sample Estimation Error Bounds: Applications to Ridge and Lasso Regression, Boosting, Tree Learning, Kernel Machines, and Inverse Problems
Author :
Jones, Lee K.
Author_Institution :
Dept. of Math. Sci., Univ. of Massachusetts, Lowell, MA, USA
Volume :
55
Issue :
12
fYear :
2009
Firstpage :
5700
Lastpage :
5727
Abstract :
Optimal local estimation is formulated in the minimax sense for inverse problems and nonlinear regression. This theory provides best mean squared finite sample error bounds for some popular statistical learning algorithms and also for several optimal improvements of other existing learning algorithms such as smoothing splines and kernel regularization. The bounds and improved algorithms are not based on asymptotics or Bayesian assumptions and are truly local for each query, not depending on cross-validated estimates at other queries to optimize modeling parameters. Results are given for optimal local learning of approximately linear functions with side information (context) using real algebraic geometry. In particular, finite sample error bounds are given for ridge regression and for a local version of lasso regression. The new regression methods require only quadratic programming with linear or quadratic inequality constraints for implementation. Greedy additive expansions are then combined with local minimax learning via a change in metric. An optimal strategy is presented for fusing the local minimax estimators of a class of experts, providing optimal finite sample prediction error bounds from (random) forests. Local minimax learning is extended to kernel machines. Best local prediction error bounds for finite samples are given for Tikhonov regularization. The geometry of reproducing kernel Hilbert space is used to derive improved estimators with finite sample mean squared error (MSE) bounds for class membership probability in two-class pattern classification problems. A purely local, cross-validation-free algorithm is proposed which uses Fisher information with these bounds to determine the best local kernel shape in vector machine learning. Finally, a locally quadratic solution to the finite Fourier moments problem is presented. After reading the first three sections, the reader may proceed directly to any of the subsequent applications sections.
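For a concrete, purely illustrative point of reference: the Python sketch below (NumPy/SciPy, with hypothetical toy data and variable names chosen here, not taken from the paper) shows ordinary ridge regression in closed form and the standard lasso rewritten as a quadratic objective with linear nonnegativity constraints, since the abstract notes that the new regression methods require only quadratic programming with linear or quadratic inequality constraints. This is not the paper's local minimax estimator; it only illustrates the kind of optimization involved.

# Minimal sketch, assuming standard ridge/lasso formulations (not the paper's
# query-specific local minimax estimators). Toy data and names are hypothetical.
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)
n, d = 40, 5
X = rng.normal(size=(n, d))
beta_true = np.array([1.5, 0.0, -2.0, 0.0, 0.5])
y = X @ beta_true + 0.1 * rng.normal(size=n)

# Ridge regression in closed form: argmin ||y - Xb||^2 + lam ||b||^2.
lam = 1.0
beta_ridge = np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)

# Lasso as a quadratic program: write b = u - v with u, v >= 0 and penalize
# sum(u + v); the objective is quadratic and the constraints are linear
# inequalities, handled here by SciPy's bound-constrained L-BFGS-B solver.
def lasso_qp_objective(z, X, y, lam):
    u, v = z[:d], z[d:]
    b = u - v
    resid = y - X @ b
    return 0.5 * resid @ resid + lam * np.sum(u + v)

z0 = np.zeros(2 * d)
bounds = [(0.0, None)] * (2 * d)  # nonnegativity constraints on u and v
res = minimize(lasso_qp_objective, z0, args=(X, y, lam),
               method="L-BFGS-B", bounds=bounds)
beta_lasso = res.x[:d] - res.x[d:]

# Prediction at a single query point x0 (the paper's estimators are local to
# each query; here we simply evaluate the global fits at x0).
x0 = rng.normal(size=d)
print("ridge prediction:", x0 @ beta_ridge)
print("lasso prediction:", x0 @ beta_lasso)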
Keywords :
inverse problems; learning (artificial intelligence); regression analysis; Bayesian assumption; Fisher information; Tikhonov regularization; algebraic geometry; class membership probability; cross validation free algorithm; finite Fourier moments; finite sample estimation error bounds; greedy additive expansion; kernel Hilbert space; kernel machines; kernel regularization; lasso regression; local minimax learning; mean squared error bounds; mean squared finite sample error bounds; nonlinear regression; optimal finite sample prediction error bounds; optimal local estimation; optimal local learning; pattern classification; quadratic inequality constraints; quadratic programming; ridge regression; smoothing splines; statistical learning algorithm; tree learning; vector machine learning; Boosting; Estimation error; Information geometry; Inverse problems; Kernel; Machine learning; Minimax techniques; Regression tree analysis; Smoothing methods; Statistical learning; Fusion; inverse problem; minimax; reproducing kernel; ridge regression
fLanguage :
English
Journal_Title :
IEEE Transactions on Information Theory
Publisher :
IEEE
ISSN :
0018-9448
Type :
jour
DOI :
10.1109/TIT.2009.2027479
Filename :
5319754