Title :
Software quality modeling: The impact of class noise on the random forest classifier
Author :
Folleco, Andres ; Khoshgoftaar, Taghi M. ; Van Hulse, Jason ; Bullard, Lofton
Author_Institution :
Dept. of Comput. Sci. & Eng., Florida Atlantic Univ., Boca Raton, FL
Abstract :
This study investigates the impact of increasing levels of simulated class noise on software quality classification. Class noise was injected into seven software engineering measurement datasets, and the performance of three learners (random forests, C4.5, and Naive Bayes) was analyzed. The random forest classifier was used in this study because of its strong performance relative to well-known and commonly used classifiers such as C4.5 and Naive Bayes. Further, relatively little prior research in software quality classification has considered the random forest classifier. The experimental factors considered in this study were the level of class noise and the percentage of minority instances injected with noise. The empirical results demonstrate that the random forest obtained the best and most consistent classification performance in all experiments.
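The two experimental factors named in the abstract (noise level and the share of noise placed on minority instances) can be illustrated with a minimal sketch. The abstract does not give the exact injection procedure, so everything below is an assumption made for illustration: the function name `inject_class_noise`, the 0/1 label encoding with 1 as the minority (fault-prone) class, and the rule that the number of flipped labels is `2 * noise_level * n_minority`.

```python
import random


def inject_class_noise(labels, noise_level, minority_share, seed=0):
    """Return a copy of binary labels (0/1) with some labels flipped.

    Assumed scheme (hypothetical, not taken from the paper):
    - label 1 is the minority class, label 0 the majority class;
    - round(2 * noise_level * n_minority) labels are flipped in total;
    - a fraction `minority_share` of the flips hits minority instances,
      the remainder hits majority instances.
    """
    if not 0.0 <= noise_level <= 0.5:
        raise ValueError("noise_level must be in [0, 0.5]")
    if not 0.0 <= minority_share <= 1.0:
        raise ValueError("minority_share must be in [0, 1]")

    rng = random.Random(seed)
    minority_idx = [i for i, y in enumerate(labels) if y == 1]
    majority_idx = [i for i, y in enumerate(labels) if y == 0]

    n_flips = round(2 * noise_level * len(minority_idx))
    n_from_minority = min(round(minority_share * n_flips), len(minority_idx))
    n_from_majority = min(n_flips - n_from_minority, len(majority_idx))

    # Choose which instances receive a wrong label, without replacement.
    flipped = set(rng.sample(minority_idx, n_from_minority))
    flipped.update(rng.sample(majority_idx, n_from_majority))

    return [1 - y if i in flipped else y for i, y in enumerate(labels)]


# Example: 10 minority and 90 majority instances, noise level 0.2,
# half of the noise placed on the minority class.
clean = [1] * 10 + [0] * 90
noisy = inject_class_noise(clean, noise_level=0.2, minority_share=0.5)
n_changed = sum(a != b for a, b in zip(clean, noisy))
```

Under these assumptions, each learner would be trained on the noisy labels and evaluated against clean labels, so that any drop in performance can be attributed to the injected noise.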
Keywords :
Bayes methods; pattern classification; software metrics; software quality; C4.5; class noise; naive Bayes; random forest classifier; software quality classification; software quality modeling; Classification algorithms; Machine learning; Noise level; Noise measurement; Noise robustness; Radio frequency; Software algorithms; Software engineering; Software measurement; Software quality
Conference_Title :
Evolutionary Computation, 2008. CEC 2008. (IEEE World Congress on Computational Intelligence). IEEE Congress on
Conference_Location :
Hong Kong
Print_ISBN :
978-1-4244-1822-0
Electronic_ISBN :
978-1-4244-1823-7
DOI :
10.1109/CEC.2008.4631321