Conference record number:
3297
Paper title:
Genetic Algorithm and Fuzzy C-Means for Feature Selection: Based on a Dual Fitness Function
Title in another language:
Genetic Algorithm and Fuzzy C-Means for Feature Selection: Based on a Dual Fitness Function
Authors:
Amiri Souri, Elmira (Dept. of Computer & Information Technology Engineering, Amirkabir University of Technology, Tehran, Iran); Ahmadi, Abbas (Dept. of Computer & Information Technology Engineering, Amirkabir University of Technology, Tehran, Iran); Mohebi, Azadeh (Iranian Research Institute for Information Science and Technology (IRANDOC), Tehran, Iran)
Keywords:
Fuzzy C-Means Algorithm, Genetic Algorithm (GA), Feature Selection
Conference title:
19th International Symposium on Artificial Intelligence and Signal Processing
Abstract (English):
Feature selection is an effective way to reduce computational complexity and information redundancy in high-dimensional data classification and clustering. Selecting the best features is much harder in unsupervised learning than in supervised learning, because there are no data labels to guide selection algorithms in removing irrelevant and redundant features. In this paper, we propose a new approach to unsupervised feature selection that uses a Genetic Algorithm (GA) as a heuristic search method and combines it with the Fuzzy C-Means (FCM) algorithm. We propose a dual, multi-objective fitness function based on the Davies-Bouldin (DB) and Calinski-Harabasz (CH) indices. We show that these indices do not necessarily behave similarly. Thus, rather than simply taking their weighted average as the fitness function, we propose a new way to aggregate them based on their trade-offs. A comparison with popular feature selection algorithms across different datasets indicates that the proposed approach outperforms them.
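Illustrative sketch (not the authors' implementation): the Python below shows one way a candidate feature subset, encoded as a 0/1 mask such as a GA chromosome, could be scored with both indices after Fuzzy C-Means clustering. The function names, the toy data, the farthest-first initialization, and the defaults (m = 2, 50 iterations) are assumptions made for this example. The abstract does not specify how the two indices are aggregated, so the sketch simply returns the pair (DB, CH), where a lower DB and a higher CH indicate better-separated clusters.

```python
import math

def project(data, mask):
    """Keep only the columns whose mask bit is 1."""
    return [[x for x, keep in zip(row, mask) if keep] for row in data]

def dist(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def centroid(points):
    return [sum(col) / len(points) for col in zip(*points)]

def fcm_labels(points, c, m=2.0, iters=50):
    """Basic Fuzzy C-Means; returns hard labels (nearest final center)."""
    # Assumption: deterministic farthest-first initialization of the centers.
    centers = [points[0]]
    while len(centers) < c:
        centers.append(max(points, key=lambda p: min(dist(p, q) for q in centers)))
    dims = range(len(points[0]))
    for _ in range(iters):
        # Membership update: u_ik proportional to d_ik^(-2/(m-1)).
        memberships = []
        for p in points:
            w = [max(dist(p, q), 1e-12) ** (-2.0 / (m - 1.0)) for q in centers]
            total = sum(w)
            memberships.append([wk / total for wk in w])
        # Center update: mean of the points weighted by u_ik^m.
        new_centers = []
        for k in range(c):
            wk = [u[k] ** m for u in memberships]
            new_centers.append(
                [sum(w * p[j] for w, p in zip(wk, points)) / sum(wk) for j in dims])
        centers = new_centers
    return [min(range(c), key=lambda k: dist(p, centers[k])) for p in points]

def davies_bouldin(groups):
    """Mean over clusters of the worst (S_i + S_j) / d(c_i, c_j) ratio."""
    cents = [centroid(g) for g in groups]
    scatter = [sum(dist(p, ck) for p in g) / len(g) for g, ck in zip(groups, cents)]
    k = len(groups)
    worst = [max((scatter[i] + scatter[j]) / dist(cents[i], cents[j])
                 for j in range(k) if j != i) for i in range(k)]
    return sum(worst) / k

def calinski_harabasz(groups):
    """Between-cluster over within-cluster dispersion, scaled by (n-k)/(k-1)."""
    all_points = [p for g in groups for p in g]
    overall = centroid(all_points)
    cents = [centroid(g) for g in groups]
    between = sum(len(g) * dist(ck, overall) ** 2 for g, ck in zip(groups, cents))
    within = sum(dist(p, ck) ** 2 for g, ck in zip(groups, cents) for p in g)
    n, k = len(all_points), len(groups)
    return (between / (k - 1)) / (within / (n - k))

def dual_fitness(data, mask, c=2):
    """Return (DB, CH) for the feature subset selected by `mask`."""
    if not any(mask):
        raise ValueError("mask must select at least one feature")
    points = project(data, mask)
    labels = fcm_labels(points, c)
    groups = [[p for p, l in zip(points, labels) if l == k] for k in range(c)]
    groups = [g for g in groups if g]
    if len(groups) < 2:
        raise ValueError("clustering collapsed to a single cluster")
    return davies_bouldin(groups), calinski_harabasz(groups)

# Hypothetical toy data: features 0 and 1 form two clusters, feature 2 is noise.
TOY_DATA = [
    [0.0, 0.1, 3.0], [0.2, 0.0, -4.0], [0.1, 0.3, 1.0], [0.3, 0.2, -2.0],
    [5.0, 5.1, 2.5], [5.2, 5.0, -3.5], [5.1, 5.3, 0.5], [5.3, 5.2, -1.0],
]
```

Calling `dual_fitness(TOY_DATA, [1, 1, 0])` and `dual_fitness(TOY_DATA, [0, 0, 1])` scores the informative pair of features against the noise feature. A GA would evaluate many such masks and would need some rule for comparing the resulting (DB, CH) pairs, which is the part the paper addresses.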