DocumentCode
120677
Title
Multi-objective genetic algorithm approach to feature subset optimization
Author
Saroj ; Jyoti
Author_Institution
Dept. of Comput. Sci. & Eng., Guru Jambheshwar Univ. of Sci. & Technol., Hisar, India
fYear
2014
fDate
21-22 Feb. 2014
Firstpage
544
Lastpage
548
Abstract
The presence of unimportant and superfluous features in datasets motivates researchers to devise novel feature selection strategies. Feature selection is multi-objective in nature, so optimizing feature subsets with respect to any single evaluation criterion is not sufficient [1]. Moreover, discovering a single best subset of features is of limited interest; finding several feature subsets that reflect a trade-off among several objective criteria is more beneficial, as it gives users a broad choice of feature subsets. To combine several feature selection criteria, we propose multi-objective optimization of feature subsets using a Multi-Objective Genetic Algorithm (MOGA). This work attempts to discover non-dominated feature subsets of small cardinality with high predictive power and least redundancy. To this end, we use NSGA II, a well-known MOGA, to discover non-dominated feature subsets for the task of classification. The main contribution of this paper is the design of a novel multi-objective fitness function that uses information gain, mutual correlation and the size of the feature subset as the optimization criteria. The suggested approach is validated on seven datasets from the UCI machine learning repository. Support Vector Machine, a well-tested classification algorithm, is used to measure classification accuracy. The results confirm that the proposed system is able to discover diverse optimal feature subsets that are well spread in the overall feature space, and that the classification accuracy of the resulting feature subsets is reasonably high.
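The notion of non-dominated feature subsets under the three criteria named in the abstract can be illustrated with a small sketch. The abstract does not give the paper's exact fitness formulas, so the definitions here are assumptions: the information-gain objective is taken as the sum of per-feature gains, redundancy as the mean absolute pairwise correlation, and the function names (`objectives`, `dominates`, `non_dominated`) are hypothetical. For a toy input the sketch enumerates all subsets exhaustively instead of running NSGA II, which would search the same objective space with a genetic algorithm.

```python
from itertools import combinations

def objectives(subset, info_gain, corr):
    """Objective tuple to MINIMIZE: (-total info gain, mean |corr|, size).

    Assumed definitions, not the paper's exact fitness function.
    """
    total_gain = sum(info_gain[i] for i in subset)
    pairs = list(combinations(subset, 2))
    # A single-feature subset has no pairs, so its redundancy is 0.
    mean_corr = sum(abs(corr[i][j]) for i, j in pairs) / len(pairs) if pairs else 0.0
    return (-total_gain, mean_corr, len(subset))

def dominates(a, b):
    """True if a is no worse than b on every objective and better on at least one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def non_dominated(subsets, info_gain, corr):
    """Return the subsets whose objective tuples no other subset dominates."""
    scored = [(s, objectives(s, info_gain, corr)) for s in subsets]
    return [s for s, f in scored if not any(dominates(g, f) for _, g in scored)]

# Toy data (made up for illustration): three features.
info_gain = [0.75, 0.5, 0.125]
corr = [
    [1.0, 0.875, 0.125],
    [0.875, 1.0, 0.25],
    [0.125, 0.25, 1.0],
]
all_subsets = [c for k in range(1, 4) for c in combinations(range(3), k)]
front = non_dominated(all_subsets, info_gain, corr)
print(front)
```

With this toy data, features 0 and 1 are both informative but strongly correlated, so the front contains several alternatives that trade gain against redundancy and size rather than a single best subset.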
Keywords
correlation methods; feature selection; genetic algorithms; pattern classification; support vector machines; MOGA; NSGA II; UCI machine learning repository; cardinality; classification accuracy; classification algorithm; feature selection criteria; feature selection strategies; feature space; feature subset optimization; feature subset selection; feature subset size; information gain; multiobjective fitness function; multiobjective genetic algorithm; multiobjective optimization; multioptimization criteria; mutual correlation; nondominated feature subsets; objective criteria; predictive power; support vector machine; Accuracy; Classification algorithms; Correlation; Filtering algorithms; Genetic algorithms; Linear programming; Optimization; Feature subset selection; Multi-Objective Genetic Algorithm; Multi-objective optimization; Non-dominated solutions;
fLanguage
English
Publisher
IEEE
Conference_Titel
Advance Computing Conference (IACC), 2014 IEEE International
Conference_Location
Gurgaon
Print_ISBN
978-1-4799-2571-1
Type
conf
DOI
10.1109/IAdCC.2014.6779383
Filename
6779383