Title :
Predicting fault-prone software modules using feature selection and classification through data mining algorithms
Author :
Ramani, R. Geetha ; Kumar, Sujay V. ; Jacob, Shomona Gracia
Author_Institution :
Dept. of Inf. Sci. & Technol., Anna Univ., Chennai, India
Abstract :
Software defect detection has been an important research topic in software engineering for more than a decade. This work evaluates the performance of supervised machine learning techniques in predicting defective software through data mining algorithms. The paper emphasizes the performance of classification algorithms in categorizing seven datasets (CM1, JM1, MW1, KC3, PC1, PC2, PC3 and PC4) into two classes, namely Defective and Normal. The study uses publicly available datasets from different organizations, which allowed us to explore the impact of data from different sources on different processes for finding appropriate classification models. We propose a computational framework that uses data mining techniques to detect the existence of defects in software components. The framework comprises data pre-processing, data classification and classifier evaluation. In this paper, we report the performance of twenty classification algorithms on seven publicly available datasets from the NASA MDP Repository. The Random Tree classification algorithm produced 100 percent accuracy in classifying the datasets, and hence the features selected by this technique were considered the most significant. The results were validated with suitable test data.
Keywords :
data mining; learning (artificial intelligence); pattern classification; software fault tolerance; CM1 dataset; JM1 dataset; KC3 dataset; MW1 dataset; NASA MDP Repository; PC1 dataset; PC2 dataset; PC3 dataset; PC4 dataset; classification algorithm; data mining algorithm; defective dataset class; fault-prone software module prediction; feature classification; feature selection; normal dataset class; random tree classification algorithm; software defect detection; software engineering; supervised machine learning; Classification; Data Mining; Feature Selection; Machine learning;
Conference_Title :
Computational Intelligence & Computing Research (ICCIC), 2012 IEEE International Conference on
Conference_Location :
Coimbatore
Print_ISBN :
978-1-4673-1342-1
DOI :
10.1109/ICCIC.2012.6510294