Title :
Using the Number of Faults to Improve Fault-Proneness Prediction of the Probability Models
Author :
Li, Lianfa ; Leung, Hareton
Author_Institution :
LREIS, Chinese Acad. of Sci., Beijing, China
fDate :
March 31 2009-April 2 2009
Abstract :
Existing fault-proneness prediction methods are based on unsampling: the training dataset contains no information on the number of faults in each module or on how faults are distributed among the modules. In this paper, we propose an oversampling method that uses the number of faults recorded in the training dataset to improve fault-proneness prediction. Our tests show that the difference between the predictions obtained with oversampling and with unsampling is statistically significant, and that our method can improve the prediction of two probability models, namely logistic regression and naive Bayes with kernel estimators.
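The abstract does not spell out the oversampling scheme, so the following is only a minimal sketch of one plausible reading: each faulty module is replicated in the training set once per recorded fault, while fault-free modules appear once. The function name `oversample_by_fault_count`, the two-metric feature rows, and the replication rule itself are assumptions made for illustration, not the authors' published procedure.

```python
from typing import List, Sequence, Tuple


def oversample_by_fault_count(
    features: Sequence[Sequence[float]],
    fault_counts: Sequence[int],
) -> Tuple[List[List[float]], List[int]]:
    """Replicate each module once per recorded fault (assumed rule).

    Fault-free modules are kept once with label 0; a module with k > 0
    faults contributes k identical rows with label 1.
    """
    X: List[List[float]] = []
    y: List[int] = []
    for row, n_faults in zip(features, fault_counts):
        copies = max(1, n_faults)          # fault-free modules still appear once
        label = 1 if n_faults > 0 else 0   # binary fault-proneness label
        for _ in range(copies):
            X.append(list(row))
            y.append(label)
    return X, y


# Hypothetical module metrics (e.g. lines of code, cyclomatic complexity).
features = [[120.0, 4.0], [300.0, 9.0], [45.0, 1.0]]
fault_counts = [0, 3, 1]

X_train, y_train = oversample_by_fault_count(features, fault_counts)
```

The oversampled `X_train` and `y_train` could then be passed to any probabilistic classifier, such as logistic regression. For learners that accept per-sample weights, using the fault count as a weight would be a comparable alternative to physically duplicating rows.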
Keywords :
learning (artificial intelligence); software fault tolerance; statistical distributions; fault distribution; fault-proneness prediction method; oversampling method; probability model; training dataset; Computer science; Data analysis; Distributed computing; Kernel; Logistics; Mechanical variables measurement; Predictive models; Probability; Software measurement; Testing; bugs; fault-proneness prediction; learner; quality assessment; software engineering; statistical analysis;
Conference_Titel :
Computer Science and Information Engineering, 2009 WRI World Congress on
Conference_Location :
Los Angeles, CA
Print_ISBN :
978-0-7695-3507-4
DOI :
10.1109/CSIE.2009.349