Title :
Systematic Labeling Bias: De-biasing Where Everyone is Wrong
Author :
Cabrera, G.F. ; Miller, C.J. ; Schneider, J.
Author_Institution :
Dept. of Comput. Sci., Univ. of Chile, Santiago, Chile
Abstract :
Many real-world classification problems use ground truth labels created by human annotators. However, observed data is never perfect, and even labels assigned by perfect annotators can be systematically biased because of the poor quality of the data they are labeling. This bias does not arise from annotator measurement error; it is intrinsic to the observational data. We present a method for de-biasing labels that simultaneously learns a classification model, estimates the intrinsic biases in the ground truth, and provides new de-biased labels. We test our algorithm on simulated and real data and show that it is superior to standard denoising algorithms, such as instance-weighted logistic regression.
Keywords :
estimation theory; image classification; image denoising; learning (artificial intelligence); regression analysis; classification model learning; data quality; denoising algorithms; intrinsic bias estimation; logistic regression; systematic labeling bias; Accuracy; Gold; Labeling; Logistics; Spirals; Standards; Training
Conference_Title :
Pattern Recognition (ICPR), 2014 22nd International Conference on
Conference_Location :
Stockholm
DOI :
10.1109/ICPR.2014.756