Title :
A statistical approach to learning and generalization in layered neural networks
Author :
Levin, Esther ; Tishby, Naftali ; Solla, Sara A.
Author_Institution :
AT&T Bell Lab., Murray Hill, NJ, USA
Date :
10/1/1990
Abstract :
A general statistical description of the problem of learning from examples is presented. Learning in layered networks is posed as a search in the network parameter space for a network that minimizes an additive error function over a set of statistically independent examples. By imposing the equivalence of the minimum-error and maximum-likelihood criteria for training the network, the Gibbs distribution on the ensemble of networks with a fixed architecture is derived. The probability of correct prediction of a novel example can be expressed using this ensemble, serving as a measure of the network's generalization ability. The entropy of the prediction distribution is shown to be a consistent measure of the network's performance. The proposed formalism is applied to the problems of selecting an optimal architecture and predicting learning curves.
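The abstract's core construction can be illustrated with a minimal numerical sketch. The code below is a hypothetical toy (a single sigmoid unit, a random sample of parameter vectors standing in for the network ensemble, and an assumed inverse temperature `beta = 1`), not the paper's own implementation: it forms the Gibbs distribution exp(-beta * E(w)) over candidate networks from their additive training error, uses the ensemble average of the per-network likelihood as the probability of correctly predicting a novel example, and reports the entropy of the resulting binary prediction distribution.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def example_error(w, x, y):
    # Additive per-example error: negative log-likelihood of target y
    # under the network's predicted probability p = sigmoid(w . x).
    p = sigmoid(x @ w)
    return -(y * np.log(p) + (1 - y) * np.log(1 - p))

# Training set of statistically independent examples (toy data).
X = rng.normal(size=(20, 3))
w_true = np.array([1.0, -2.0, 0.5])          # assumed "teacher" network
Y = (sigmoid(X @ w_true) > 0.5).astype(float)

# A random sample of candidate networks: points in parameter space
# standing in for the ensemble of networks with a fixed architecture.
W = rng.normal(size=(500, 3))

# Total training error is additive over the examples.
E_train = np.array([example_error(w, X, Y).sum() for w in W])

# Gibbs distribution on the ensemble: P(w) proportional to exp(-beta * E(w)).
beta = 1.0
logp = -beta * E_train
logp -= logp.max()        # subtract the max for numerical stability
P = np.exp(logp)
P /= P.sum()

# Probability of correct prediction of a novel example: the ensemble
# average of each network's likelihood of the new target.
x_new = rng.normal(size=3)
y_new = float(sigmoid(x_new @ w_true) > 0.5)
pred_prob = np.sum(P * np.exp([-example_error(w, x_new, y_new) for w in W]))

# Entropy of the (binary) prediction distribution: a measure of how
# confident the ensemble is about the novel example.
H = -(pred_prob * np.log(pred_prob) + (1 - pred_prob) * np.log(1 - pred_prob))
print(f"P(correct prediction) = {pred_prob:.3f}, entropy = {H:.3f} nats")
```

In this sketch, sharpening the Gibbs distribution (larger `beta`, or more training examples) concentrates probability on low-error networks, which raises the prediction probability and lowers the entropy, in line with the abstract's claim that the entropy of the prediction distribution tracks the network's performance.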
Keywords :
learning systems; neural nets; statistical analysis; entropy; layered neural networks; learning curves; maximum likelihood criteria; network training; prediction distribution; statistical description; Entropy; Intelligent networks; Maximum likelihood estimation; Neural networks; Parameter estimation; Parametric statistics; Probability; Stochastic processes; Supervised learning; Training data;
Journal_Title :
Proceedings of the IEEE