• DocumentCode
    2308220
  • Title

    Regularization learning and early stopping in linear networks

  • Author

    Hagiwara, Katsuyuki ; Kuno, Kazuhiro

  • Author_Institution
    Fac. of Phys. Eng., Mie Univ., Tsu, Japan
  • Volume
    4
  • fYear
    2000
  • fDate
    2000
  • Firstpage
    511
  • Abstract
    Generally, learning is performed so as to minimize the sum of squared errors between network outputs and training data. Unfortunately, this procedure does not necessarily yield a network with good generalization ability when the number of connection weights is relatively large. In such a situation, overfitting to the training data occurs. To overcome this problem, there are several approaches, such as regularization learning and early stopping. It has been suggested that these two methods are closely related. In this article, we first give a unified interpretation of the relationship between the two methods through the analysis of linear networks in the context of statistical regression, i.e., the linear regression model. In addition, several theoretical works have addressed the optimal regularization parameter and the optimal stopping time. Here, we also consider this problem from the unified viewpoint mentioned above. This analysis enables us to understand the statistical meaning of the optimality. Then, estimates of the optimal regularization parameter and the optimal stopping time are presented and examined by simple numerical simulations. Moreover, for the choice of the regularization parameter, the relationship between the Bayesian framework and the generalization error minimization framework is discussed.
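
    The relationship described in the abstract can be illustrated with a minimal sketch in the linear regression setting. It is not taken from the paper: the function names, the synthetic data, the step size, and the values of `lam` and `n_steps` are all assumptions chosen for illustration. The sketch uses the standard fact that, in the eigenbasis of X^T X with eigenvalues s_i, ridge regression scales each least-squares coefficient by s_i / (s_i + lam), while gradient descent started at zero and stopped after t steps scales it by 1 - (1 - step * s_i)^t.

```python
import numpy as np

def ridge_solution(X, y, lam):
    """Closed-form minimizer of 0.5*||Xw - y||^2 + 0.5*lam*||w||^2."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)

def gd_iterate(X, y, step, n_steps):
    """Gradient descent on 0.5*||Xw - y||^2, started at w = 0 and
    stopped after n_steps updates (early stopping)."""
    w = np.zeros(X.shape[1])
    for _ in range(n_steps):
        w = w - step * (X.T @ (X @ w - y))
    return w

def shrinkage_factors(X, lam, step, n_steps):
    """Per-eigendirection factors applied to the least-squares solution
    by ridge regression and by early-stopped gradient descent."""
    s = np.linalg.eigvalsh(X.T @ X)           # eigenvalues of X^T X
    ridge = s / (s + lam)                     # ridge filter
    early = 1.0 - (1.0 - step * s) ** n_steps # early-stopping filter
    return ridge, early

# Synthetic data (illustrative assumption, not from the paper).
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 5))
w_true = rng.normal(size=5)
y = X @ w_true + 0.5 * rng.normal(size=50)

# A step size of 1 / (largest eigenvalue) keeps every factor in [0, 1].
step = 1.0 / np.linalg.eigvalsh(X.T @ X).max()
ridge_f, early_f = shrinkage_factors(X, lam=1.0, step=step, n_steps=20)
print("ridge factors:         ", np.round(ridge_f, 3))
print("early-stopping factors:", np.round(early_f, 3))
```

    Both procedures shrink directions with small eigenvalues more strongly than directions with large ones, which is one way to see why a larger `lam` and a smaller `n_steps` play similar roles. How the paper estimates the optimal values of either quantity is not reproduced here.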
  • Keywords
    belief networks; generalisation (artificial intelligence); learning (artificial intelligence); neural nets; Bayesian framework; early stopping; generalization error minimization; learning; linear regression; optimal regularization; regularization learning; Bayesian methods; Context modeling; Integrated circuit noise; Intelligent networks; Least squares approximation; Numerical simulation; Physics; Probability distribution; Training data; Vectors;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Neural Networks, 2000. IJCNN 2000, Proceedings of the IEEE-INNS-ENNS International Joint Conference on
  • Conference_Location
    Como
  • ISSN
    1098-7576
  • Print_ISBN
    0-7695-0619-4
  • Type

    conf

  • DOI
    10.1109/IJCNN.2000.860822
  • Filename
    860822