Improved rates and asymptotic normality for nonparametric neural network estimators

Author

Chen, Xiaohong ; White, Halbert

Author_Institution

Dept. of Econ., Chicago Univ., IL, USA

Volume

45

Issue

2

fYear

1999

fDate

3/1/1999 12:00:00 AM

Firstpage

682

Lastpage

691

Abstract

We obtain an improved approximation rate (in Sobolev norm) of r^{-1/2 - α/(d+1)} for a large class of single hidden layer feedforward artificial neural networks (ANN) with r hidden units and possibly nonsigmoid activation functions when the target function satisfies certain smoothness conditions. Here, d is the dimension of the domain of the target function, and α ∈ (0, 1) is related to the smoothness of the activation function. When applying this class of ANNs to nonparametrically estimate (train) a general target function using the method of sieves, we obtain new root-mean-square convergence rates of O_p([n/log(n)]^{-(1+2α/(d+1))/[4(1+α/(d+1))]}) = o_p(n^{-1/4}) by letting the number of hidden units r_n increase appropriately with the sample size (number of training examples) n. These rates are valid for i.i.d. data as well as for uniform mixing and absolutely regular (β-mixing) stationary time series data. In addition, the rates are fast enough to deliver root-n asymptotic normality for plug-in estimates of smooth functionals using general ANN sieve estimators. As interesting applications to nonlinear time series, we establish rates for ANN sieve estimators of four different multivariate target functions: a conditional mean, a conditional quantile, a joint density, and a conditional density. We also obtain root-n asymptotic normality results for semiparametric model coefficient and average derivative estimators.
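To make the two exponents in the abstract concrete, the following minimal sketch evaluates them for given d and α. The function names are hypothetical and not taken from the paper; the code only restates the exponents quoted above, namely 1/2 + α/(d+1) for the approximation rate and (1 + 2α/(d+1)) / [4(1 + α/(d+1))] for the convergence rate, and checks that the latter exceeds 1/4 whenever α > 0.

```python
from fractions import Fraction


def _check(d, alpha):
    # The abstract states alpha lies in (0, 1); d is a positive dimension.
    if d < 1:
        raise ValueError("d must be a positive integer")
    if not (0 < alpha < 1):
        raise ValueError("alpha must lie strictly between 0 and 1")


def approximation_exponent(d, alpha):
    """Return e such that the approximation error is of order r**(-e)."""
    _check(d, alpha)
    return Fraction(1, 2) + Fraction(alpha) / (d + 1)


def convergence_exponent(d, alpha):
    """Return e such that the estimation rate is O_p((n / log n)**(-e))."""
    _check(d, alpha)
    a = Fraction(alpha) / (d + 1)
    return (1 + 2 * a) / (4 * (1 + a))


if __name__ == "__main__":
    # Illustrative values only: d = 3 and alpha = 1/2 are assumptions.
    d, alpha = 3, Fraction(1, 2)
    print(approximation_exponent(d, alpha))  # 5/8
    print(convergence_exponent(d, alpha))    # 5/18, which exceeds 1/4
```

Writing a = α/(d+1), the inequality (1 + 2a) / [4(1 + a)] > 1/4 reduces to 1 + 2a > 1 + a, which holds for every a > 0. This is why the stated rate is o_p(n^{-1/4}) despite the logarithmic factor, and the gain over 1/4 shrinks as the dimension d grows.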

Keywords

approximation theory; convergence of numerical methods; estimation theory; feedforward neural nets; learning (artificial intelligence); statistical analysis; time series; transfer functions; β-mixing; Sobolev norm; artificial neural networks; asymptotic normality; average derivative estimators; conditional density; conditional mean; conditional quantile; dimension; hidden units; i.i.d. data; improved approximation rate; joint density; method of sieves; multivariate target functions; nonlinear time series; nonparametric neural network estimators; nonsigmoid activation functions; plug-in estimates; regular stationary time series data; root-mean-square convergence rates; sample size; semiparametric model coefficient; sieve estimators; single hidden layer feedforward ANN; smooth functionals; smoothness conditions; statistical inference; target function; training examples; uniform mixing; Artificial neural networks; Associate members; Convergence; Feedforward neural networks; Finance; Gaussian distribution; Kernel; Neural networks; Statistics;

fLanguage

English

Journal_Title

Information Theory, IEEE Transactions on

Publisher

IEEE

ISSN

0018-9448

Type

jour

DOI

10.1109/18.749011

Filename

749011