• DocumentCode
    1034616
  • Title

    An efficient constrained training algorithm for feedforward networks

  • Author

    Karras, Dimitris A. ; Perantonis, Stavros J.

  • Author_Institution
    Inst. of Inf. & Telecommun., Nat. Res. Centre, Athens, Greece
  • Volume
    6
  • Issue
    6
  • fYear
    1995
  • fDate
    11/1/1995 12:00:00 AM
  • Firstpage
    1420
  • Lastpage
    1434
  • Abstract
    A novel algorithm is presented which supplements the training phase in feedforward networks with various forms of information about desired learning properties. This information is represented by conditions which must be satisfied in addition to the demand for minimization of the usual mean square error cost function. The purpose of these conditions is to improve convergence, learning speed, and generalization properties through prompt activation of the hidden units, optimal alignment of successive weight vector offsets, elimination of excessive hidden nodes, and regulation of the magnitude of search steps in the weight space. The algorithm is applied to several small- and large-scale binary benchmark training tasks, to test its convergence ability and learning speed, as well as to a large-scale OCR problem, to test its generalization capability. Its performance in terms of percentage of local minima, learning speed, and generalization ability is evaluated and found superior to that of the backpropagation algorithm and variants thereof, especially when the statistical significance of the results is taken into account.
  • Keywords
    convergence; feedforward neural nets; generalisation (artificial intelligence); learning (artificial intelligence); optical character recognition; binary benchmark training tasks; constrained training algorithm; convergence; feedforward networks; generalization properties; large-scale OCR problem; learning speed; local minima; mean square error cost function; minimization; search steps; training phase; Associate members; Backpropagation algorithms; Benchmark testing; Convergence; Cost function; Large-scale systems; Mean square error methods; Optical character recognition software; Scalability; Supervised learning
  • fLanguage
    English
  • Journal_Title
    Neural Networks, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    1045-9227
  • Type

    jour

  • DOI
    10.1109/72.471365
  • Filename
    471365