Conference record number:
4650
Paper title:
Genetic Algorithm and Fuzzy C-Means for Feature Selection: Based on a Dual Fitness Function
Authors:
Amiri Souri, Elmira (Amirkabir University of Technology); Mohebi, Azadeh (Iranian Research Institute for Information Science and Technology); Ahmadi, Abbas (Amirkabir University of Technology)
Number of pages:
9
Keywords:
Feature Selection, Genetic Algorithm (GA), Fuzzy C-Means Algorithm
Publication year:
1396 (Solar Hijri calendar)
Conference title:
19th International Conference on Artificial Intelligence and Signal Processing
Document language:
English
Abstract:
Feature selection is an effective way to reduce computational complexity and information redundancy in high-dimensional data classification and clustering. Selecting the best features is much harder in unsupervised learning than in supervised learning, because there are no data labels to guide selection algorithms in removing irrelevant and redundant features. In this paper, we propose a new approach to unsupervised feature selection that uses a Genetic Algorithm as a heuristic search method and combines it with the Fuzzy C-Means algorithm. We propose a dual, multi-objective fitness function based on the Davies-Bouldin (DB) and Calinski-Harabasz (CH) indices. We show that these indices do not necessarily behave similarly. Thus, rather than simply taking their weighted average as a new fitness function, we propose a new way to aggregate them based on their trade-offs. A comparison with popular feature selection algorithms across different datasets indicates that the proposed approach outperforms them.
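As a rough illustration of the two criteria named in the abstract, the sketch below scores a candidate feature subset with the standard textbook definitions of the Davies-Bouldin and Calinski-Harabasz indices. It is not the authors' fitness function: this record does not give their formulas or their aggregation rule. The function names (`db_and_ch` and the helpers), the toy data, and the use of a fixed hard partition in place of Fuzzy C-Means memberships are all assumptions made for illustration.

```python
import math

def _project(rows, mask):
    """Keep only the columns whose mask entry is truthy."""
    return [[x for x, keep in zip(row, mask) if keep] for row in rows]

def _centroid(points):
    n = len(points)
    return [sum(col) / n for col in zip(*points)]

def _dist(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def db_and_ch(rows, labels, mask):
    """Return (Davies-Bouldin, Calinski-Harabasz) for a hard partition,
    computed only on the features selected by `mask` (hypothetical helper).
    Assumes at least one selected feature and at least two clusters."""
    data = _project(rows, mask)
    clusters = {}
    for point, label in zip(data, labels):
        clusters.setdefault(label, []).append(point)
    keys = sorted(clusters)
    cents = {k: _centroid(clusters[k]) for k in keys}
    # Mean distance of each cluster's points to its own centroid.
    scatter = {
        k: sum(_dist(p, cents[k]) for p in clusters[k]) / len(clusters[k])
        for k in keys
    }
    # Davies-Bouldin: average over clusters of the worst similarity ratio.
    db = sum(
        max((scatter[i] + scatter[j]) / _dist(cents[i], cents[j])
            for j in keys if j != i)
        for i in keys
    ) / len(keys)
    # Calinski-Harabasz: between-cluster over within-cluster dispersion.
    overall = _centroid(data)
    between = sum(len(clusters[k]) * _dist(cents[k], overall) ** 2 for k in keys)
    within = sum(_dist(p, cents[k]) ** 2 for k in keys for p in clusters[k])
    n, c = len(data), len(keys)
    ch = (between / (c - 1)) / (within / (n - c))
    return db, ch

# Toy data (assumed): feature 0 separates the two groups, feature 1 is noise.
rows = [[0.0, 5.0], [0.2, -4.0], [0.1, 3.0],
        [5.0, -5.0], [5.2, 4.0], [5.1, -3.0]]
labels = [0, 0, 0, 1, 1, 1]

db_good, ch_good = db_and_ch(rows, labels, [1, 0])    # informative feature only
db_noisy, ch_noisy = db_and_ch(rows, labels, [1, 1])  # noise feature included
print("feature 0 only:", db_good, ch_good)
print("both features: ", db_noisy, ch_noisy)
```

The two indices point in opposite directions (lower DB is better, higher CH is better) and live on very different scales, which is one reason a plain weighted average of the two can be hard to tune.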
Country:
Iran