Title of article
Core-generating approximate minimum entropy discretization for rough set feature selection in pattern classification
Author/Authors
David Tian, Xiao-jun Zeng, John Keane
Issue Information
Journal with serial number, year 2011
Pages
18
From page
863
To page
880
Abstract
Rough set feature selection (RSFS) can be used to improve classifier performance. RSFS removes redundant attributes whilst retaining the important ones that preserve the classification power of the original dataset. Reducts are the feature subsets selected by RSFS, and the core is the intersection of all the reducts of a dataset. RSFS can only handle discrete attributes, so continuous attributes need to be discretized before being input to RSFS. Discretization determines the core size of a discrete dataset; however, current discretization methods do not consider the core size during discretization. Earlier work proposed the core-generating approximate minimum entropy discretization (C-GAME) algorithm, which selects the maximum number of minimum entropy cuts capable of generating a non-empty core within a discrete dataset. The contributions of this paper are as follows: (1) the C-GAME algorithm is improved by adding a new type of constraint to eliminate the possibility that only a single reduct is present in a C-GAME-discrete dataset; (2) the performance of C-GAME is evaluated in comparison to C4.5, multi-layer perceptrons, RBF networks and k-nearest neighbours classifiers on ten datasets chosen from the UCI Machine Learning Repository; (3) the performance of C-GAME is evaluated in comparison to the Recursive Minimum Entropy Partition (RMEP), ChiMerge, Boolean Reasoning and Equal Frequency discretization algorithms on the same ten datasets; (4) the effects of C-GAME and the other four discretization methods on the sizes of reducts are evaluated; (5) an upper bound is defined on the total number of reducts within a dataset; (6) the effects of different discretization algorithms on the total number of reducts are analysed; (7) the performance of two RSFS algorithms (a genetic algorithm and Johnson’s algorithm) is analysed.
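The abstract's definitions of reducts and core can be illustrated with a minimal sketch. It is not the C-GAME algorithm or either RSFS algorithm from the paper: it is a brute-force enumeration over a made-up, already discrete, consistent decision table, and it assumes a reduct is a minimal attribute subset under which rows with equal attribute values always share the same decision. The names `ROWS`, `is_consistent`, `find_reducts` and `core` are hypothetical.

```python
from itertools import combinations

# Hypothetical toy decision table: (a0, a1, a2, decision).
# a2 duplicates a1, and the decision depends on both a0 and a1.
ROWS = [
    (0, 0, 0, "n"),
    (0, 1, 1, "y"),
    (1, 0, 0, "y"),
    (1, 1, 1, "n"),
]


def is_consistent(rows, attrs):
    """Return True if rows that agree on `attrs` always share one decision."""
    seen = {}
    for *values, decision in rows:
        key = tuple(values[a] for a in attrs)
        if seen.setdefault(key, decision) != decision:
            return False
    return True


def find_reducts(rows, n_attrs):
    """Enumerate all minimal consistent attribute subsets (brute force)."""
    reducts = []
    # Visit subsets by increasing size so that any superset of a known
    # reduct can be skipped as non-minimal.
    for size in range(1, n_attrs + 1):
        for subset in combinations(range(n_attrs), size):
            if any(set(r) <= set(subset) for r in reducts):
                continue
            if is_consistent(rows, subset):
                reducts.append(subset)
    return reducts


def core(reducts):
    """The core is the intersection of all reducts."""
    return set.intersection(*(set(r) for r in reducts)) if reducts else set()


reducts = find_reducts(ROWS, 3)
print(reducts)        # [(0, 1), (0, 2)]
print(core(reducts))  # {0}
```

In this toy table either a1 or a2 can be dropped, giving two reducts, while a0 appears in both and therefore forms the core. The exhaustive search is exponential in the number of attributes, so it is only suitable for small illustrations.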
Keywords
Pattern classification , Core-generating approximate minimum entropy discretization , Rough set feature selection , Constraint satisfaction optimization problems
Journal title
International Journal of Approximate Reasoning
Serial Year
2011
Record number
1183012