DocumentCode :
2514125
Title :
Mining Static Code Metrics for a Robust Prediction of Software Defect-Proneness
Author :
Li, Lianfa ; Leung, Hareton
Author_Institution :
LREIS, Inst. of Geogr. Sci. & Natural Resources Res., Beijing, China
fYear :
2011
fDate :
22-23 Sept. 2011
Firstpage :
207
Lastpage :
214
Abstract :
Defect-proneness prediction is affected by multiple factors, including sampling bias, non-metric factors, and model uncertainty. These factors often contribute to prediction uncertainty and increase the variance of predictions. This paper proposes two methods of mining static code metrics to enhance defect-proneness prediction. Given that little non-metric or qualitative information can be extracted from software code, we first suggest using a robust unsupervised learning method, shared nearest neighbors (SNN), to extract the similarity patterns of the code metrics. These patterns indicate similar characteristics among components of the same cluster, which may lead to the introduction of similar defects. Using the similarity patterns together with code metrics as predictors may improve defect-proneness prediction. The second method uses Occam's window and Bayesian model averaging to deal with model uncertainty: first, the datasets are used to train and cross-validate multiple learners; then, highly qualified models are selected and integrated into a robust prediction. From a study based on 12 datasets from NASA, we conclude that our proposed solutions can contribute to better defect-proneness prediction.
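The second method in the abstract (select well-performing models, then combine them) could be sketched roughly as follows. This is only an illustration and not the authors' implementation: the function names, the use of cross-validated log scores as a stand-in for posterior model probabilities, equal model priors, and the window ratio of 20 are all assumptions made here.

```python
import math

def occam_window_weights(log_scores, ratio=20.0):
    """Keep models whose score is within a factor `ratio` of the best one
    and return normalized averaging weights for the kept models.

    log_scores: dict of model name -> log score (assumed proxy for the
    log posterior model probability, with equal priors).
    """
    best = max(log_scores.values())
    kept = {}
    for name, score in log_scores.items():
        relative = math.exp(score - best)  # score relative to the best model
        if relative >= 1.0 / ratio:        # inside the (assumed) Occam's window
            kept[name] = relative
    total = sum(kept.values())
    return {name: w / total for name, w in kept.items()}

def averaged_probability(weights, model_probs):
    """Weighted average of each kept model's predicted defect probability."""
    return sum(w * model_probs[name] for name, w in weights.items())

# Hypothetical scores for three learners; "c" falls outside the window.
scores = {"a": 0.0, "b": math.log(0.5), "c": math.log(0.01)}
weights = occam_window_weights(scores)
prediction = averaged_probability(weights, {"a": 0.9, "b": 0.3, "c": 0.1})
```

Here the discarded model contributes nothing to the final prediction, and the remaining weights sum to one.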
Keywords :
Bayes methods; data mining; software metrics; unsupervised learning; Bayesian model; Occam windows; SNN; data mining; mining static code metrics; nonmetric information; qualitative information; robust prediction; shared nearest neighbors; software codes; software defect proneness; unsupervised learning method; Clustering algorithms; Data mining; Measurement; Predictive models; Robustness; Uncertainty; Unsupervised learning; data mining; defect-proneness; robust prediction; software quality; uncertainty;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Empirical Software Engineering and Measurement (ESEM), 2011 International Symposium on
Conference_Location :
Banff, AB
ISSN :
1938-6451
Print_ISBN :
978-1-4577-2203-5
Type :
conf
DOI :
10.1109/ESEM.2011.29
Filename :
6092569