• DocumentCode
    3483425
  • Title
    Decision tree decomposition-based complex feature selection for text chunking
  • Author
    Hwang, Young-Sook ; Rim, Hae-Chang
  • Author_Institution
    Dept. of Comput. Sci. & Eng., Korea Univ., Seoul, South Korea
  • Volume
    5
  • fYear
    2002
  • fDate
    18-22 Nov. 2002
  • Firstpage
    2217
  • Abstract
    Incorporating a feature selection method into a classification model often provides a number of advantages. In this paper we propose a new feature selection method based on a discriminative perspective that aims to improve classification accuracy. The method is developed for a classification model for text chunking. For effective feature selection, we use a decision tree as an intermediate feature space inducer. To select a more compact feature set with less computational load, we organize a partially ordered feature space according to the IGR distribution of features. Experimental results show that: (1) the computational complexity in a high-dimensional feature space can be reduced by selecting features based on the decision tree decomposition; (2) the text chunking system using the proposed feature selection significantly improves performance compared with a decision tree classifier.
  • Keywords
    decision trees; pattern classification; text analysis; IGR distribution; classification accuracy; classification model; compact feature set; computational complexity; decision tree; decision tree decomposition; decision tree decomposition-based complex feature selection; discriminative perspective; high-dimensional feature space; intermediate feature space inducer; partially ordered feature space; text chunking; Atomic measurements; Computational complexity; Computer science; Decision trees; Distributed computing; Entropy; Extraterrestrial measurements; Humans; Probability distribution; Support vector machines
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Title
    Neural Information Processing, 2002. ICONIP '02. Proceedings of the 9th International Conference on
  • Print_ISBN
    981-04-7524-1
  • Type
    conf
  • DOI
    10.1109/ICONIP.2002.1201887
  • Filename
    1201887