• DocumentCode
    1331268
  • Title

    Use of contextual information for feature ranking and discretization

  • Author

    Hong, Se June

  • Author_Institution
    Div. of Res., IBM Thomas J. Watson Res. Center, Yorktown Heights, NY, USA
  • Volume
    9
  • Issue
    5
  • fYear
    1997
  • Firstpage
    718
  • Lastpage
    730
  • Abstract
    Deriving classification rules or decision trees from examples is an important problem. When there are too many features, discarding weak features before the derivation process is highly desirable. When there are numeric features, they need to be discretized for the rule generation. We present a new approach to these problems. Traditional techniques make use of feature merits based on either the information theoretic, or the statistical correlation between each feature and the class. We instead assign merits to features by finding each feature's “obligation” to the class discrimination in the context of other features. The merits are then used to rank the features, select a feature subset, and discretize the numeric variables. Experience with benchmark example sets demonstrates that the new approach is a powerful alternative to the traditional methods. This paper concludes by posing some new technical issues that arise from this approach.
  • Keywords
    feature extraction; knowledge based systems; benchmark example sets; classification rules; contextual information; decision trees from examples; feature discretization; feature ranking; numeric features; rule generation; statistical correlation; Artificial neural networks; Character generation; Classification tree analysis; Decision trees; Degradation; Manufacturing processes; Multidimensional systems; Robustness; Testing; Text categorization
  • fLanguage
    English
  • Journal_Title
    Knowledge and Data Engineering, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    1041-4347
  • Type

    jour

  • DOI
    10.1109/69.634751
  • Filename
    634751