• DocumentCode
    1405464
  • Title
    Building cost functions minimizing to some summary statistics
  • Author
    Saerens, Marco
  • Author_Institution
    IRIDIA Lab., Univ. Libre de Bruxelles, Belgium
  • Volume
    11
  • Issue
    6
  • fYear
    2000
  • fDate
    11/1/2000
  • Firstpage
    1263
  • Lastpage
    1271
  • Abstract
    A learning machine, or model, is usually trained by minimizing a given criterion (the expectation of the cost function) measuring the discrepancy between the model output and the desired output. As is already well known, the choice of the cost function has a profound impact on the probabilistic interpretation of the output of the model after training. In this work, we use the calculus of variations in order to tackle this problem. In particular, we derive necessary and sufficient conditions on the cost function ensuring that the output of the trained model approximates 1) the conditional expectation of the desired output given the explanatory variables; 2) the conditional median (and, more generally, the q-quantile); 3) the conditional geometric mean; and 4) the conditional variance. The same method could be applied to the estimation of other summary statistics as well. We also argue that the least absolute deviations criterion could, in some cases, act as an alternative to the ordinary least squares criterion for nonlinear regression. In the same vein, the concept of "regression quantile" is briefly discussed. (A minimal numerical sketch of the classical cost-function/statistic correspondences appears at the end of this record.)
  • Keywords
    learning (artificial intelligence); minimisation; neural nets; statistical analysis; building cost function minimization; conditional expectation; conditional geometric mean; conditional median; conditional variance; cost function expectation; explanatory variables; learning machine; least absolute deviations criterion; necessary and sufficient conditions; nonlinear regression; probabilistic interpretation; q-quantile; regression quantile; summary statistics; Artificial neural networks; Calculus; Cost function; Least squares approximation; Least squares methods; Machine learning; Mean square error methods; Solid modeling; Statistics; Sufficient conditions;
  • fLanguage
    English
  • Journal_Title
    Neural Networks, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    1045-9227
  • Type
    jour
  • DOI
    10.1109/72.883416
  • Filename
    883416
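
The correspondence between cost functions and summary statistics described in the abstract can be checked numerically. The sketch below is not taken from the paper; the lognormal test distribution, the grid search, and all names are arbitrary illustrative choices. It verifies the classical facts that the expected squared error is minimized by the mean, the expected absolute error by the median, and the asymmetric pinball (check) loss by the q-quantile.

# Minimal numerical sketch (illustrative only, not the paper's derivation):
# compare the minimizers of three risks over constant predictions.
import numpy as np

rng = np.random.default_rng(0)
y = rng.lognormal(mean=0.0, sigma=1.0, size=100_000)  # skewed targets

def pinball(e, q):
    """Check / pinball loss: q*e for e >= 0, (q - 1)*e for e < 0."""
    return np.where(e >= 0, q * e, (q - 1) * e)

grid = np.linspace(y.min(), y.max(), 2000)  # candidate constant predictions

sq_risk  = [np.mean((y - c) ** 2) for c in grid]       # ordinary least squares
abs_risk = [np.mean(np.abs(y - c)) for c in grid]      # least absolute deviations
q90_risk = [np.mean(pinball(y - c, 0.9)) for c in grid]  # 0.9-quantile loss

print("mean:        ", y.mean(),            " argmin L2:      ", grid[np.argmin(sq_risk)])
print("median:      ", np.median(y),        " argmin L1:      ", grid[np.argmin(abs_risk)])
print("0.9-quantile:", np.quantile(y, 0.9), " argmin pinball: ", grid[np.argmin(q90_risk)])

For each risk, the grid minimizer lands close to the corresponding summary statistic of y, which is the constant-prediction special case of the conditional results listed in the abstract.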