Title :
Adaptive stochastic approximation by the simultaneous perturbation method
Author_Institution :
Appl. Phys. Lab., Johns Hopkins Univ., Laurel, MD, USA
Abstract :
Stochastic approximation (SA) has long been applied to problems of minimizing loss functions or root finding with noisy input information. As with all stochastic search algorithms, there are adjustable algorithm coefficients that must be specified and that can have a profound effect on algorithm performance. It is known that choosing these coefficients according to an SA analog of the deterministic Newton-Raphson algorithm provides an optimal or near-optimal form of the algorithm. However, directly determining the required Hessian matrix (or Jacobian matrix for root finding) to achieve this algorithm form has often been difficult or impossible in practice. The paper presents a general adaptive SA algorithm that is based on a simple method for estimating the Hessian matrix while concurrently estimating the primary parameters of interest. The approach applies in both the gradient-free optimization (Kiefer-Wolfowitz) and root-finding/stochastic gradient-based (Robbins-Monro) settings, and is based on the "simultaneous perturbation (SP)" idea introduced previously. The algorithm requires only a small number of loss function or gradient measurements per iteration, independent of the problem dimension, to adaptively estimate the Hessian and the parameters of primary interest. Aside from introducing the adaptive SP approach, the paper presents practical implementation guidance, asymptotic theory, and a nontrivial numerical evaluation. Also included is a discussion and numerical analysis comparing the adaptive SP approach with the iterate-averaging approach to accelerated SA.
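The idea summarized in the abstract, for the gradient-free setting, can be illustrated with a minimal Python sketch. It is not taken from the paper's text, which is not reproduced in this record; it is one plausible rendering of a second-order SP recursion under stated assumptions. The function names (`sp_estimates`, `make_positive_definite`, `adaptive_sp`), the gain constants (`a`, `A`, `c`, `alpha`, `gamma`), the use of the same perturbation size for both perturbation vectors, and the eigenvalue floor used to keep the averaged Hessian estimate positive definite are all illustrative choices, not values confirmed by the source.

```python
import numpy as np


def sp_estimates(loss, theta, c, delta, delta_t):
    """Return SP (gradient, Hessian) estimates at theta from four loss values."""
    y_plus = loss(theta + c * delta)
    y_minus = loss(theta - c * delta)
    # Two-sided simultaneous perturbation gradient estimate at theta.
    g = (y_plus - y_minus) / (2.0 * c * delta)
    # One-sided SP gradient estimates at the two perturbed points,
    # using a second, independent perturbation vector delta_t.
    g_plus = (loss(theta + c * delta + c * delta_t) - y_plus) / (c * delta_t)
    g_minus = (loss(theta - c * delta + c * delta_t) - y_minus) / (c * delta_t)
    # Differencing the two gradient estimates gives a raw Hessian estimate,
    # which is then symmetrized.
    M = np.outer(g_plus - g_minus, 1.0 / delta) / (2.0 * c)
    return g, 0.5 * (M + M.T)


def make_positive_definite(H, eig_floor):
    """Map a symmetric matrix to a positive definite one (assumed mapping)."""
    w, V = np.linalg.eigh(0.5 * (H + H.T))
    w = np.maximum(np.abs(w), eig_floor)
    return (V * w) @ V.T


def adaptive_sp(loss, theta0, n_iter=2000, a=1.0, A=20.0, c=0.1,
                alpha=1.0, gamma=0.101, eig_floor=0.2, seed=0):
    """Newton-like SA loop driven by SP gradient and Hessian estimates."""
    rng = np.random.default_rng(seed)
    theta = np.array(theta0, dtype=float)
    p = theta.size
    H_bar = np.eye(p)  # overwritten by the first per-iteration estimate
    for k in range(n_iter):
        a_k = a / (k + 1 + A) ** alpha   # step-size gain (assumed form)
        c_k = c / (k + 1) ** gamma       # perturbation size (assumed form)
        delta = rng.choice([-1.0, 1.0], size=p)    # symmetric Bernoulli +/-1
        delta_t = rng.choice([-1.0, 1.0], size=p)
        g, H_hat = sp_estimates(loss, theta, c_k, delta, delta_t)
        # Running average of the per-iteration Hessian estimates.
        H_bar = (k * H_bar + H_hat) / (k + 1)
        H_pd = make_positive_definite(H_bar, eig_floor)
        # Newton-like update using the averaged, regularized Hessian estimate.
        theta = theta - a_k * np.linalg.solve(H_pd, g)
    return theta, H_bar
```

Each pass through the loop calls `loss` four times whatever the length of `theta`, which is the dimension-independence the abstract refers to. A call such as `adaptive_sp(lambda th: float(th @ th), [1.0, 1.0])` returns the final parameter estimate together with the averaged Hessian estimate. The sketch does not cover the root-finding (Robbins-Monro) setting, where direct gradient measurements would replace the loss differences.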
Keywords :
Hessian matrices; Jacobian matrices; convergence; gradient methods; parameter estimation; search problems; adaptive stochastic approximation; asymptotic theory; gradient-free optimization; implementation guidance; nontrivial numerical evaluation; numerical analysis; root finding; simultaneous perturbation method; stochastic gradient-based setting; stochastic search algorithms; Acceleration; Adaptive control; Backpropagation algorithms; Constraint optimization; Least squares approximation; Noise measurement; Perturbation methods; Stochastic processes
Journal_Title :
Automatic Control, IEEE Transactions on
DOI :
10.1109/TAC.2000.880982