Title :
The Impact of Mislabelling on the Performance and Interpretation of Defect Prediction Models
Author :
Tantithamthavorn, Chakkrit; McIntosh, Shane; Hassan, Ahmed E.; Ihara, Akinori; Matsumoto, Kenichi
Author_Institution :
Graduate School of Information Science, Nara Institute of Science and Technology, Nara, Japan
Abstract :
The reliability of a prediction model depends on the quality of the data from which it was trained. Defect prediction models may therefore be unreliable if they are trained using noisy data. Recent research suggests that randomly injected noise, which changes the classification (label) of software modules from defective to clean (and vice versa), can impact the performance of defect models. Yet, in reality, incorrectly labelled (i.e., mislabelled) issue reports are likely non-random. In this paper, we study whether mislabelling is random, and the impact that realistic mislabelling has on the performance and interpretation of defect models. Through a case study of 3,931 manually curated issue reports from the Apache Jackrabbit and Lucene systems, we find that: (1) issue report mislabelling is not random; (2) precision is rarely impacted by mislabelled issue reports, suggesting that practitioners can rely on the accuracy of modules labelled as defective by models that are trained using noisy data; (3) however, models trained on noisy data typically achieve only 56%-68% of the recall of models trained on clean data; and (4) only the most influential metrics of our defect models are robust to the noise introduced by mislabelling, suggesting that the less influential metrics of models that are trained on noisy data should not be interpreted or used to make decisions.
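
To make the study design in the abstract concrete, the following minimal Python sketch (not the authors' implementation; the synthetic data, the feature-dependent noise model, and the random-forest learner are assumptions for illustration only) trains one defect model on clean labels and one on systematically mislabelled labels, then scores both against the clean ground truth:

import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import precision_score, recall_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Synthetic module metrics and clean defect labels (1 = defective).
X = rng.normal(size=(2000, 6))
y_clean = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=0.5, size=2000) > 1).astype(int)

X_tr, X_te, y_tr, y_te = train_test_split(X, y_clean, test_size=0.5, random_state=0)

# Non-random mislabelling: defective training modules are flipped to clean
# with a probability that depends on a feature, mimicking the systematic
# (rather than coin-toss) noise the paper argues is realistic.
flip = (rng.random(len(y_tr)) < 0.4) & (y_tr == 1) & (X_tr[:, 2] > 0)
y_noisy = np.where(flip, 0, y_tr)

# Train on clean vs. noisy labels; evaluate both against clean test labels.
for name, labels in [("clean", y_tr), ("noisy", y_noisy)]:
    model = RandomForestClassifier(random_state=0).fit(X_tr, labels)
    pred = model.predict(X_te)
    print(f"{name}: precision={precision_score(y_te, pred):.2f}, "
          f"recall={recall_score(y_te, pred):.2f}")

Because the flipped labels hide defective modules from the learner, recall on the noisy model tends to drop while precision stays comparatively stable, which is the pattern of results the abstract reports.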
Keywords :
software performance evaluation; software reliability; Apache Jackrabbit system; Lucene system; defect prediction model interpretation; defect prediction model performance; defect prediction models; mislabelling impact; prediction model reliability; randomly-injected noise; software modules; Data mining; Data models; Noise; Noise measurement; Predictive models; Software; Data Quality; Mislabelling; Software Defect Prediction; Software Quality Assurance
Conference_Title :
2015 IEEE/ACM 37th IEEE International Conference on Software Engineering (ICSE)
Conference_Location :
Florence, Italy
DOI :
10.1109/ICSE.2015.93