DocumentCode :
3123590
Title :
Wrapper-Based Feature Ranking for Software Engineering Metrics
Author :
Altidor, Wilker ; Khoshgoftaar, Taghi M. ; Napolitano, Amri
Author_Institution :
Florida Atlantic Univ., Boca Raton, FL, USA
fYear :
2009
fDate :
13-15 Dec. 2009
Firstpage :
241
Lastpage :
246
Abstract :
The application of feature ranking to software engineering datasets is rare at best. In this study, we consider wrapper-based feature ranking where nine performance metrics aided by a particular learner are evaluated. We consider five learners and take two different approaches, each in conjunction with one of two different methodologies: 3-fold Cross-Validation (CV) and 3-fold Cross-Validation Risk Impact (CV-R). The classifiers are Naive Bayes (NB), Multi Layer Perceptron (MLP), k-Nearest Neighbors (kNN), Support Vector Machines (SVM), and Logistic Regression (LR). The performance metrics used as ranking techniques are Overall Accuracy (OA), F-Measure (FM), Geometric Mean (GM), Arithmetic Mean (AM), Area under ROC (AUC), Area under PRC (PRC), Best F-Measure (BFM), Best Geometric Mean (BGM), and Best Arithmetic Mean (BAM). To evaluate the classifier performance after feature selection has been applied, we use AUC as the performance evaluator. This paper represents a preliminary report on our proposed wrapper-based feature ranking approach to software defect prediction problems.
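The general idea of wrapper-based feature ranking with 3-fold cross-validation can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes each feature is scored by training a learner on that feature alone, uses a 1-nearest-neighbor rule in place of the five learners, and uses overall accuracy in place of the nine metrics. The function names and the toy data are hypothetical, and the CV-R variant is not covered.

```python
import random
from statistics import mean


def one_nn_predict(train, x):
    """Predict the label of x with 1-nearest-neighbor on a single feature."""
    return min(train, key=lambda pair: abs(pair[0] - x))[1]


def cv_accuracy(values, labels, folds=3, seed=0):
    """Mean overall accuracy of a single-feature 1-NN learner over k folds."""
    idx = list(range(len(values)))
    random.Random(seed).shuffle(idx)  # fixed seed keeps the folds reproducible
    scores = []
    for f in range(folds):
        test = idx[f::folds]  # every folds-th shuffled index forms one fold
        held_out = set(test)
        train = [(values[i], labels[i]) for i in idx if i not in held_out]
        hits = sum(one_nn_predict(train, values[i]) == labels[i] for i in test)
        scores.append(hits / len(test))
    return mean(scores)


def rank_features(rows, labels, folds=3, seed=0):
    """Return (feature_index, score) pairs sorted from best to worst."""
    scores = [
        (j, cv_accuracy([row[j] for row in rows], labels, folds, seed))
        for j in range(len(rows[0]))
    ]
    return sorted(scores, key=lambda s: s[1], reverse=True)


# Made-up toy data: feature 0 separates the classes, feature 1 is noise.
rows = [[0.10, 5], [0.20, 1], [0.30, 4], [0.15, 2], [0.25, 3], [0.05, 6],
        [0.90, 5], [0.80, 1], [0.70, 4], [0.85, 2], [0.75, 3], [0.95, 6]]
labels = [0] * 6 + [1] * 6

ranking = rank_features(rows, labels)
for feature, score in ranking:
    print(f"feature {feature}: mean accuracy {score:.2f}")
```

Swapping `cv_accuracy` for another metric, or `one_nn_predict` for another learner, gives a different ranking technique in the same framework; the top-ranked features would then be passed to a final classifier for evaluation.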
Keywords :
Bayes methods; feature extraction; multilayer perceptrons; pattern classification; regression analysis; software metrics; software performance evaluation; support vector machines; 3-fold cross validation risk impact; 3-fold cross validation; Naive Bayes classifier; area under PRC; area under ROC; best F-measure; best arithmetic mean; best geometric mean; k-nearest neighbors classifier; logistic regression; multilayer perceptron classifier; overall accuracy; performance evaluation; software engineering metrics; support vector machine; wrapper based feature ranking; Application software; Arithmetic; Logistics; Measurement; Nearest neighbor searches; Niobium; Partial response channels; Software engineering; Support vector machine classification; Support vector machines; feature selection; performance metrics; software engineering metrics; wrapper-based feature ranking;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Machine Learning and Applications, 2009. ICMLA '09. International Conference on
Conference_Location :
Miami Beach, FL
Print_ISBN :
978-0-7695-3926-3
Type :
conf
DOI :
10.1109/ICMLA.2009.17
Filename :
5381847