Title of article :
Class distribution estimation based on the Hellinger distance
Author/Authors :
Víctor González-Castro, Rocío Alaiz-Rodríguez, Enrique Alegre
Issue Information :
Journal with serial number, year 2013
Pages :
19
From page :
146
To page :
164
Abstract :
Class distribution estimation (quantification) plays an important role in many practical classification problems. Firstly, it is important in order to adapt the classifier to the operational conditions when they differ from those assumed in learning. Additionally, there are some real domains where the quantification task is itself valuable due to the high variability of the class prior probabilities. Our novel quantification approach for two-class problems is based on distributional divergence measures. The mismatch between the test data distribution and validation distributions generated in a fully controlled way is measured by the Hellinger distance in order to estimate the prior probability that minimizes this divergence. Experimental results on several binary classification problems show the benefits of this approach when compared to such approaches as counting the predicted class labels and other methods based on the classifier confusion matrix or on posterior probability estimations. We also illustrate these techniques as well as their robustness against the base classifier performance (a neural network) with a boar semen quality control setting. Empirical results show that the quantification can be conducted with a mean absolute error lower than 0.008, which seems very promising in this field.
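The core idea in the abstract, choosing the prior that minimizes the Hellinger distance between the test distribution and a controlled mixture of validation distributions, can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the distributions are compared as histograms of classifier scores in [0, 1], and the function names, bin count, and candidate grid are all hypothetical choices made for illustration.

```python
import numpy as np


def hellinger(p, q):
    """Hellinger distance between two discrete distributions that each sum to 1."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    return float(np.sqrt(0.5 * np.sum((np.sqrt(p) - np.sqrt(q)) ** 2)))


def normalized_hist(scores, bins):
    """Histogram of scores in [0, 1], normalized to a probability vector."""
    counts, _ = np.histogram(scores, bins=bins, range=(0.0, 1.0))
    return counts / counts.sum()


def estimate_prior(val_pos, val_neg, test, bins=10, grid=101):
    """Estimate the positive-class prior of `test` (assumed procedure).

    val_pos / val_neg: classifier scores of validation positives / negatives.
    test: classifier scores of the unlabeled test sample.
    """
    h_pos = normalized_hist(val_pos, bins)
    h_neg = normalized_hist(val_neg, bins)
    h_test = normalized_hist(test, bins)

    # Try each candidate prior, build the corresponding validation mixture,
    # and keep the prior whose mixture is closest to the test histogram.
    candidates = np.linspace(0.0, 1.0, grid)
    distances = [
        hellinger(p * h_pos + (1.0 - p) * h_neg, h_test) for p in candidates
    ]
    return float(candidates[int(np.argmin(distances))])
```

A grid search keeps the sketch short and transparent; a one-dimensional numerical optimizer could replace it where a finer estimate is needed.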
Keywords :
Class prior probability estimation , Quantification , Hellinger distance , Class distribution shift
Journal title :
Information Sciences
Serial Year :
2013
Record number :
1215263