DocumentCode :
151505
Title :
Software maintainability prediction by data mining of software code metrics
Author :
Kaur, Amardeep ; Kaur, Kanwalpreet ; Pathak, K.
Author_Institution :
Comput. Sci. Dept., GGS Indraprastha Univ., New Delhi, India
fYear :
2014
fDate :
5-6 Sept. 2014
Firstpage :
1
Lastpage :
6
Abstract :
Software maintainability is a key quality attribute that determines the success of a software product, so predicting it accurately can help improve overall software quality. This paper applies data mining to some new predictor metrics, in addition to traditionally used software metrics, to predict the maintainability of software systems. The prediction models are constructed using static code metric datasets from four open source software (OSS) systems: Lucene, JHotdraw, JEdit, and JTreeview. Lucene contains 385 classes and 135,241 lines of code (LOC); JHotdraw contains 159 classes and 21,802 LOC; JEdit contains 275 classes and 104,053 LOC; and JTreeview contains 60 classes and 11,988 LOC. The metrics were collected using two metrics extraction tools: the Chidamber and Kemerer Java metric (CKJM) tool and IntelliJ IDEA. Naïve Bayes, Bayes Network, Logistic, MultiLayerPerceptron, and Random Forest classifiers are used to identify the software modules that are difficult to maintain. Random forest models are found to be the most useful for software maintainability prediction by data mining of software code metrics, because they achieve higher recall, precision, and area under the ROC curve (AUC) than the other classifiers.
Keywords :
Bayes methods; data mining; multilayer perceptrons; pattern classification; public domain software; random processes; software maintenance; software metrics; software quality; software reliability; AUC; Bayes network; CKJM tool; Chidamber and Kemerer Java metric tool; IntelliJ IDEA; JEdit; JHotdraw; JTreeview; LOC OSS; Lucene; Naïve Bayes; ROC curve; area under curve; logistics; metrics extraction tools; multilayer perceptron; open source software; precision; predictor metrics; random forest classifiers; random forest models; recall; software code metrics; software maintainability prediction; software modules; software product; static code metric datasets; Object oriented modeling; Predictive models; Software
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Data Mining and Intelligent Computing (ICDMIC), 2014 International Conference on
Conference_Location :
New Delhi
Print_ISBN :
978-1-4799-4675-4
Type :
conf
DOI :
10.1109/ICDMIC.2014.6954262
Filename :
6954262