Title :
Comparison of loss functions for linear regression
Author :
Cherkassky, Vladimir ; Ma, Yunqian
Author_Institution :
Dept. of Electr. & Comput. Eng., Univ. of Minnesota, Minneapolis, MN, USA
Abstract :
This paper addresses selection of the loss function for regression problems with finite data. It is well known (under the standard regression formulation) that for a known noise density there exists an optimal loss function in the asymptotic setting (large number of samples); for example, squared loss is optimal under Gaussian noise. However, in real-life applications the noise density is unknown and the number of training samples is finite. For such practical settings, we suggest using Vapnik's ε-insensitive loss function. We use a practical method for setting the value of ε as a function of the (known) number of samples and the (known or estimated) noise variance (V. Cherkassky and Y. Ma, (2004), (2002)). We consider commonly used noise densities (Gaussian, uniform, and Laplacian). Empirical comparisons on several representative linear regression problems indicate that Vapnik's ε-insensitive loss yields more robust performance and better prediction accuracy than squared loss and least-modulus loss, especially for noisy high-dimensional data sets.
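To make the comparison concrete, here is a minimal Python sketch of fitting a linear model under the three losses discussed above, assuming the Cherkassky-Ma rule ε = 3σ·sqrt(ln(n)/n) for setting the tube width; the data model, the plain subgradient solver, and all parameter values are illustrative assumptions, not taken from the paper.

import numpy as np

def fit_linear(X, y, loss, eps=0.0, n_iter=2000, lr=0.05):
    # (Sub)gradient descent on the chosen loss; a proper QP/LP solver
    # would be used in practice for the eps-insensitive case.
    w = np.zeros(X.shape[1])
    for _ in range(n_iter):
        r = X @ w - y
        if loss == "squared":
            g = X.T @ (2.0 * r)                        # gradient of mean(r^2)
        elif loss == "modulus":                        # least-modulus (L1) loss
            g = X.T @ np.sign(r)
        else:                                          # eps-insensitive loss:
            g = X.T @ (np.sign(r) * (np.abs(r) > eps)) # zero inside the eps-tube
        w -= lr * g / len(y)
    return w

rng = np.random.default_rng(0)
n, d, sigma = 50, 10, 0.5                  # small-sample, additive-noise setting
w_true = rng.standard_normal(d)
X = rng.standard_normal((n, d))
y = X @ w_true + sigma * rng.standard_normal(n)        # Gaussian noise

eps = 3 * sigma * np.sqrt(np.log(n) / n)   # Cherkassky-Ma epsilon rule
for loss in ("squared", "modulus", "eps-insensitive"):
    w_hat = fit_linear(X, y, loss, eps=eps)
    print(f"{loss:16s} parameter error: {np.linalg.norm(w_hat - w_true):.3f}")

Repeating the experiment with uniform or Laplacian noise in place of the Gaussian term gives a rough analogue of the paper's empirical comparison across noise densities.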
Keywords :
Gaussian noise; learning (artificial intelligence); regression analysis; Gaussian noise density; linear regression problem; loss function; training samples; Accuracy; Additive noise; Laplace equations; Linear regression; Noise robustness; Parameter estimation; Performance loss; Statistics; Support vector machines;
Conference_Title :
2004 IEEE International Joint Conference on Neural Networks (IJCNN 2004), Proceedings
Print_ISBN :
0-7803-8359-1
DOI :
10.1109/IJCNN.2004.1379938