Title :
Error surfaces for multilayer perceptrons
Author :
Hush, Don R. ; Horne, Bill ; Salas, John M.
Author_Institution :
Dept. of Electr. & Comput. Eng., New Mexico Univ., Albuquerque, NM, USA
Abstract :
This paper examines characteristics of the error surfaces of multilayer perceptron neural networks that help explain why hill-climbing learning techniques are so slow in these networks, and that suggest ways to speed learning. First, the surface has a stair-step appearance, with many very flat and very steep regions. When the number of training samples is small, there is often a one-to-one correspondence between individual training samples and the steps on the surface; as the number of samples increases, the surface becomes smoother. In addition, the surface has flat regions that extend to infinity in all directions, making it dangerous to apply learning algorithms that perform line searches. The magnitude of the gradients on the surface strongly supports the need for floating-point representations during learning. The consequences of various weight-initialization techniques are also discussed.
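The flat-plateau phenomenon described in the abstract can be reproduced with a minimal sketch (not the authors' code): sampling the sum-of-squared-errors surface of a single sigmoid neuron over a grid of weight and bias values, and checking that the error barely changes along large steps far from the origin, where the sigmoid saturates. The toy dataset and grid ranges here are illustrative assumptions.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Illustrative toy data: four 1-D samples with binary targets.
X = np.array([-2.0, -1.0, 1.0, 2.0])
y = np.array([0.0, 0.0, 1.0, 1.0])

def sse(w, b):
    """Sum-of-squared-errors over the training set for weight w, bias b."""
    pred = sigmoid(w * X + b)
    return np.sum((pred - y) ** 2)

# Sample the error surface on a (w, b) grid; plotting E reveals the
# stair-step structure of flat plateaus separated by steep transitions.
ws = np.linspace(-10.0, 10.0, 201)
bs = np.linspace(-10.0, 10.0, 201)
E = np.array([[sse(w, b) for b in bs] for w in ws])

# Flat regions extend to infinity: with w large and negative, every
# prediction is saturated at the wrong target, so the error is high
# yet nearly constant over an arbitrarily large step in weight space.
print(sse(-100.0, 0.0))                       # high error (~4)
print(abs(sse(-100.0, 0.0) - sse(-200.0, 0.0)))  # ~0: a flat plateau
```

A line search launched on such a plateau sees an essentially constant objective in every direction, which is why the abstract calls line-search methods dangerous here, and why the near-zero gradient magnitudes on the plateaus favor floating-point weight representations.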
Keywords :
feedforward neural nets; learning (artificial intelligence); error surfaces; floating-point representations; hill-climbing methods; learning techniques; line searches; multilayer perceptrons; neural network; stair-step appearance; training samples; weight initialization techniques; Backpropagation algorithms; Convergence; Fuzzy set theory; H infinity control; Information processing; Logic; Multi-layer neural network; Multilayer perceptrons; Neural networks; Pattern recognition;
Journal_Title :
IEEE Transactions on Systems, Man and Cybernetics