DocumentCode
891352
Title
Decision tree design from a communication theory standpoint
Author
Goodman, Rodney M. ; Smyth, Padhraic
Author_Institution
Dept. of Electr. Eng., California Inst. of Technol., Pasadena, CA, USA
Volume
34
Issue
5
fYear
1988
fDate
9/1/1988 12:00:00 AM
Firstpage
979
Lastpage
994
Abstract
A communication theory approach to decision tree design based on a top-down mutual information algorithm is presented. It is shown that this algorithm is equivalent to a form of Shannon-Fano prefix coding, and several fundamental bounds relating decision-tree parameters are derived. The bounds are used in conjunction with a rate-distortion interpretation of tree design to explain several phenomena previously observed in practical decision-tree design. A termination rule for the algorithm, called the delta-entropy rule, is proposed that improves its robustness in the presence of noise. Simulation results are presented, showing that the tree classifiers derived by the algorithm compare favourably to the single nearest neighbour classifier.
Keywords
decision theory; encoding; information theory; trees (mathematics); Shannon-Fano prefix coding; communication theory; decision tree design; delta-entropy rule; rate-distortion interpretation; single nearest neighbour classifier; termination rule; top-down mutual information algorithm; Algorithm design and analysis; Classification tree analysis; Decision trees; Expert systems; Mutual information; Nearest neighbor searches; Noise robustness; Pattern recognition; Rate-distortion; Space technology
fLanguage
English
Journal_Title
Information Theory, IEEE Transactions on
Publisher
IEEE
ISSN
0018-9448
Type
jour
DOI
10.1109/18.21221
Filename
21221