DocumentCode :
3347843
Title :
A gentle Hessian for efficient gradient descent
Author :
Collobert, Ronan ; Bengio, Samy
Author_Institution :
IDIAP, Martigny, Switzerland
Volume :
5
fYear :
2004
fDate :
17-21 May 2004
Abstract :
Several second-order optimization methods for gradient descent algorithms have been proposed over the years, but they usually need to compute the inverse of the Hessian of the cost function (or an approximation of this inverse) during training. In most cases, this leads to an O(n^2) cost in time and space per iteration, where n is the number of parameters, which is prohibitive for large n. We propose instead a study of the Hessian before training. Based on a second-order analysis, we show that a block-diagonal Hessian yields an easier optimization problem than a full Hessian. We also show that the condition of block-diagonality in common machine learning models can be achieved by simply selecting an appropriate training criterion. Finally, we propose a version of the SVM criterion applied to MLPs, which verifies the aspects highlighted in this second-order analysis, but also yields very good generalization performance in practice, taking advantage of the margin effect. Several empirical comparisons on two benchmark datasets are given to illustrate this approach.
Keywords :
Hessian matrices; gradient methods; learning (artificial intelligence); multilayer perceptrons; optimisation; support vector machines; MLP; SVM criterion; block-diagonal matrix; cost function; gentle Hessian matrix; gradient descent algorithms; inverse matrix; machine learning; second-order optimization methods; training criterion; Approximation algorithms; Iterative algorithms; Optimization methods; Performance analysis; Stochastic processes; Support vector machine classification
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Acoustics, Speech, and Signal Processing, 2004. Proceedings. (ICASSP '04). IEEE International Conference on
ISSN :
1520-6149
Print_ISBN :
0-7803-8484-9
Type :
conf
DOI :
10.1109/ICASSP.2004.1327161
Filename :
1327161