• Title of article

    Core-generating approximate minimum entropy discretization for rough set feature selection in pattern classification (Original Research Article)

  • Author/Authors

    David Tian (author), Xiao-jun Zeng (author), John Keane (author)

  • Issue Information
    Journal with serial issue number, year 2011
  • Pages
    18
  • From page
    863
  • To page
    880
  • Abstract
    Rough set feature selection (RSFS) can be used to improve classifier performance. RSFS removes redundant attributes whilst retaining important ones that preserve the classification power of the original dataset. Reducts are the feature subsets selected by RSFS, and the core is the intersection of all the reducts of a dataset. RSFS can only handle discrete attributes; hence, continuous attributes need to be discretized before being input to RSFS. Discretization determines the core size of a discrete dataset. However, current discretization methods do not consider the core size during discretization. Earlier work proposed the core-generating approximate minimum entropy discretization (C-GAME) algorithm, which selects the maximum number of minimum entropy cuts capable of generating a non-empty core within a discrete dataset. The contributions of this paper are as follows: (1) improvement of the C-GAME algorithm by adding a new type of constraint to eliminate the possibility that only a single reduct is present in a C-GAME-discrete dataset; (2) performance evaluation of C-GAME in comparison to C4.5, multi-layer perceptrons, RBF networks and k-nearest neighbours classifiers on ten datasets chosen from the UCI Machine Learning Repository; (3) performance evaluation of C-GAME in comparison to the Recursive Minimum Entropy Partition (RMEP), ChiMerge, Boolean Reasoning and Equal Frequency discretization algorithms on the ten datasets; (4) evaluation of the effects of C-GAME and the other four discretization methods on the sizes of reducts; (5) definition of an upper bound on the total number of reducts within a dataset; (6) analysis of the effects of different discretization algorithms on the total number of reducts; (7) performance analysis of two RSFS algorithms (a genetic algorithm and Johnson’s algorithm).
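    The abstract's definitions of reducts and the core can be illustrated with a minimal sketch. It is not the C-GAME algorithm, nor the genetic or Johnson's algorithm evaluated in the paper: it is a brute-force enumeration that is only practical for tiny tables. The toy dataset, the attribute names `a`, `b`, `c`, the decision column `d`, and the helper names are all hypothetical, and the table is assumed to be discrete and consistent.

```python
from itertools import combinations

def is_consistent(rows, attrs, decision):
    """True if rows that agree on `attrs` also agree on the decision value."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in attrs)
        # setdefault stores the first decision seen for this key
        if seen.setdefault(key, row[decision]) != row[decision]:
            return False
    return True

def reducts(rows, attrs, decision):
    """All minimal attribute subsets that keep the table consistent.

    Brute force over subsets by increasing size (illustrative only).
    Assumes the full attribute set is itself consistent.
    """
    found = []
    for size in range(1, len(attrs) + 1):
        for subset in combinations(attrs, size):
            candidate = set(subset)
            if any(r <= candidate for r in found):
                continue  # contains a smaller reduct, so it is not minimal
            if is_consistent(rows, subset, decision):
                found.append(candidate)
    return found

def core(all_reducts):
    """The core is the intersection of all reducts (empty if there are none)."""
    return set.intersection(*all_reducts) if all_reducts else set()

# Hypothetical discrete dataset: c duplicates a, and d = a XOR b.
rows = [
    {"a": 0, "b": 0, "c": 0, "d": 0},
    {"a": 0, "b": 1, "c": 0, "d": 1},
    {"a": 1, "b": 0, "c": 1, "d": 1},
    {"a": 1, "b": 1, "c": 1, "d": 0},
]

all_reducts = reducts(rows, ["a", "b", "c"], "d")
dataset_core = core(all_reducts)
print(all_reducts)   # two reducts: {a, b} and {b, c}
print(dataset_core)  # {'b'}
```

    In this toy table either `a` or `c` can be dropped but `b` cannot, so `b` appears in every reduct and forms a non-empty core.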
  • Keywords
    Pattern classification, Core-generating approximate minimum entropy discretization, Rough set feature selection, Constraint satisfaction optimization problems
  • Journal title
    International Journal of Approximate Reasoning
  • Serial Year
    2011
  • Record number

    1183012