DocumentCode
2257048
Title
Discretization of continuous-valued attributes in decision tree generation
Author
Li, Wen-Liang ; Yu, Rui-Hua ; Wang, Xi-Zhao
Author_Institution
Key Lab. of Machine Learning & Comput. Intell., Hebei Univ., Baoding, China
Volume
1
fYear
2010
fDate
11-14 July 2010
Firstpage
194
Lastpage
198
Abstract
The decision tree is one of the most popular and widely used classification models in machine learning. The discretization of continuous-valued attributes plays an important role in decision tree generation. In this paper, we improve Fayyad's discretization method, which uses the average class entropy of candidate partitions to select boundaries for discretization. Our method further reduces the number of candidate boundaries. We also propose a generalized splitting criterion for cut point selection and prove that the cut points always lie on boundaries when this criterion is used. Along with the formal proof, we present empirical results showing that the decision trees generated with such criteria are similar on several datasets from the UCI Machine Learning Repository.
Keywords
data handling; decision trees; entropy; Fayyad discretization; UCI machine learning repository; classification model; continuous-valued attributes; cut point selection; decision tree generation; generalized splitting criterion; Classification tree analysis; Entropy; Impurities; Indexes; Machine learning; Continuous-valued; Decision tree; Discretization; Splitting criterion
fLanguage
English
Publisher
IEEE
Conference_Titel
Machine Learning and Cybernetics (ICMLC), 2010 International Conference on
Conference_Location
Qingdao
Print_ISBN
978-1-4244-6526-2
Type
conf
DOI
10.1109/ICMLC.2010.5581069
Filename
5581069