DocumentCode :
2924031
Title :
Decision Trees for Probability Estimation: An Empirical Study
Author :
Liang, Han ; Zhang, Harry ; Yan, Yuhong
Author_Institution :
Fac. of Comput. Sci., New Brunswick Univ., Fredericton, NB
fYear :
2006
fDate :
Nov. 2006
Firstpage :
756
Lastpage :
764
Abstract :
Accurate probability estimation generated by learning models is desirable in some practical applications, such as medical diagnosis. In this paper, we empirically study traditional decision-tree learning models and their variants in terms of probability estimation, measured by conditional log likelihood (CLL). Furthermore, we compare decision-tree learning with other representative learning models (Naive Bayes, Naive Bayes tree, Bayesian network, K-nearest neighbors and support vector machine) with respect to probability estimation. From our experiments, we make several interesting observations. First, among the various decision-tree learning models, C4.4 is the best at yielding precise probability estimates as measured by CLL, although its performance is not good in terms of other evaluation criteria, such as accuracy and ranking. We provide an explanation for this and reveal the nature of CLL. Second, compared with other popular models, C4.4 achieves the best CLL. Finally, CLL does not dominate another well-established relevant measure, AUC (the area under the receiver operating characteristic curve), which suggests that different decision-tree learning models should be used for different objectives. Our experiments are conducted on 36 UCI sample sets that cover a wide range of domains and data characteristics. We run all the models within the machine learning platform Weka.
Keywords :
Bayes methods; belief networks; decision trees; learning (artificial intelligence); probability; support vector machines; Bayesian network; K-nearest neighbors; Naive Bayes tree; Weka; conditional log likelihood; decision trees; learning models; medical diagnosis; probability estimation; support vector machine; Application software; Computer science; Cost function; Decision trees; Equations; Machine learning; Medical diagnosis; Niobium; Support vector machines; Yield estimation;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Tools with Artificial Intelligence, 2006. ICTAI '06. 18th IEEE International Conference on
Conference_Location :
Arlington, VA
ISSN :
1082-3409
Print_ISBN :
0-7695-2728-0
Type :
conf
DOI :
10.1109/ICTAI.2006.49
Filename :
4031970