• DocumentCode
    969313
  • Title

    On loss functions which minimize to conditional expected values and posterior probabilities

  • Author

    Miller, John W. ; Goodman, Rod ; Smyth, Padhraic

  • Author_Institution
    Microsoft Res., Redmond, WA, USA
  • Volume
    39
  • Issue
    4
  • fYear
    1993
  • fDate
    7/1/1993
  • Firstpage
    1404
  • Lastpage
    1408
  • Abstract
    A loss function, or objective function, is a function used to compare parameters when fitting a model to data. The loss function gives a distance between the model output and the desired output. Two common examples are the squared-error loss function and the cross entropy loss function. Minimizing the squared-error loss function is equivalent to minimizing the mean-square difference between the model output and the expected value of the output given a particular input. This property of minimization to the expected value is formalized as P-admissibility. The necessary and sufficient conditions for P-admissibility, leading to a parametric description of all P-admissible loss functions, are found. In particular, it is shown that two of the simplest members of this class of functions are the squared-error and the cross entropy loss functions. One application of this work is in the choice of a loss function for training neural networks to provide probability estimates.
  • Keywords
    entropy; information theory; learning (artificial intelligence); minimisation; neural nets; probability; P-admissibility; conditional expected values; cross entropy loss function; loss functions; minimization; neural networks; objective function; parametric description; squared-error loss function; training; Bridges; Convergence; Distribution functions; Entropy; Gaussian approximation; Linear regression; Mathematics; Neural networks; Probability distribution; Random variables;
  • fLanguage
    English
  • Journal_Title
    Information Theory, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    0018-9448
  • Type

    jour

  • DOI
    10.1109/18.243457
  • Filename
    243457
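
The minimization property described in the abstract can be checked numerically in a simple case. The sketch below is not taken from the paper: it assumes a binary target y drawn from a Bernoulli distribution with a hypothetical probability `p_true`, and uses a plain grid search (the helper `grid_argmin` is illustrative only) to find the constant prediction q that minimizes the expected squared-error loss and the expected cross entropy loss. Under these assumptions both minimizers should coincide with the expected value of y, which here is `p_true`.

```python
import math


def expected_squared_error(q, p):
    # E[(y - q)^2] for y ~ Bernoulli(p)
    return p * (1.0 - q) ** 2 + (1.0 - p) * q ** 2


def expected_cross_entropy(q, p):
    # E[-y*log(q) - (1 - y)*log(1 - q)] for y ~ Bernoulli(p)
    return -(p * math.log(q) + (1.0 - p) * math.log(1.0 - q))


def grid_argmin(loss, p, steps=999):
    # Hypothetical helper: search q on an evenly spaced grid strictly
    # inside (0, 1) so that log(q) and log(1 - q) stay finite.
    grid = [(i + 1) / (steps + 1) for i in range(steps)]
    return min(grid, key=lambda q: loss(q, p))


p_true = 0.3  # assumed value, chosen only for illustration
q_se = grid_argmin(expected_squared_error, p_true)
q_ce = grid_argmin(expected_cross_entropy, p_true)
print(q_se, q_ce)
```

Both searches land on the grid point at `p_true`, which is the behaviour the abstract attributes to these two loss functions; the check says nothing about the paper's general characterization of the whole class.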