• DocumentCode
    2257048
  • Title

    Discretization of continuous-valued attributes in decision tree generation

  • Author

    Li, Wen-Liang ; Yu, Rui-Hua ; Wang, Xi-Zhao

  • Author_Institution
    Key Lab. of Machine Learning & Comput. Intell., Hebei Univ., Baoding, China
  • Volume
    1
  • fYear
    2010
  • fDate
    11-14 July 2010
  • Firstpage
    194
  • Lastpage
    198
  • Abstract
    The decision tree is one of the most popular and widely used classification models in machine learning. The discretization of continuous-valued attributes plays an important role in decision tree generation. In this paper, we improve Fayyad's discretization method, which uses the average class entropy of candidate partitions to select boundaries for discretization. Our method further reduces the number of candidate boundaries. We also propose a generalized splitting criterion for cut point selection and prove that the cut points are always on boundaries when using this criterion. Along with the formal proof, we present empirical results showing that the decision trees generated by using such criteria are similar on several datasets from the UCI Machine Learning Repository.
  • Keywords
    data handling; decision trees; entropy; Fayyad discretization; UCI machine learning repository; classification model; continuous-valued attributes; cut point selection; decision tree generation; generalized splitting criterion; Classification tree analysis; Entropy; Impurities; Indexes; Machine learning; Continuous-valued; Decision tree; Discretization; Splitting criterion;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Machine Learning and Cybernetics (ICMLC), 2010 International Conference on
  • Conference_Location
    Qingdao
  • Print_ISBN
    978-1-4244-6526-2
  • Type

    conf

  • DOI
    10.1109/ICMLC.2010.5581069
  • Filename
    5581069