Title :
High-order and multilayer perceptron initialization
Author :
Thimm, Georg ; Fiesler, Emile
Author_Institution :
IDIAP, Martigny, Switzerland
fDate :
3/1/1997
Abstract :
Proper initialization is one of the most important prerequisites for fast convergence of feedforward neural networks like high-order and multilayer perceptrons. This publication aims to determine the optimal variance (or range) for the initial weights and biases, which is the principal parameter of random initialization methods for both types of neural networks. An overview of random weight initialization methods for multilayer perceptrons is presented. These methods are extensively tested using eight real-world benchmark data sets and a broad range of initial weight variances by means of more than 30,000 simulations, with the aim of finding the best weight initialization method for multilayer perceptrons. For high-order networks, a large number of experiments (more than 200,000 simulations) were performed, using three weight distributions, three activation functions, several network orders, and the same eight data sets. The results of these experiments are compared to weight initialization techniques for multilayer perceptrons, which leads to the proposal of a suitable initialization method for high-order perceptrons. The conclusions on the initialization methods for both types of networks are justified by sufficiently small confidence intervals of the mean convergence times.
Keywords :
convergence; feedforward neural nets; learning (artificial intelligence); multilayer perceptrons; transfer functions; activation functions; biases; confidence intervals; fast convergence; feedforward neural networks; high-order perceptrons; initial weights; mean convergence times; multilayer perceptron; random weight initialization methods; weight distributions; Benchmark testing; Convergence; Feedforward neural networks; Helium; Multi-layer neural network; Multilayer perceptrons; Network topology; Neural networks; Optimization methods; Proposals;
Journal_Title :
Neural Networks, IEEE Transactions on