• DocumentCode
    891352
  • Title
    Decision tree design from a communication theory standpoint
  • Author
    Goodman, Rodney M. ; Smyth, Padhraic
  • Author_Institution
    Dept. of Electr. Eng., California Inst. of Technol., Pasadena, CA, USA
  • Volume
    34
  • Issue
    5
  • fYear
    1988
  • fDate
    9/1/1988 12:00:00 AM
  • Firstpage
    979
  • Lastpage
    994
  • Abstract
    A communication theory approach to decision tree design based on a top-down mutual information algorithm is presented. It is shown that this algorithm is equivalent to a form of Shannon-Fano prefix coding, and several fundamental bounds relating decision-tree parameters are derived. The bounds are used in conjunction with a rate-distortion interpretation of tree design to explain several phenomena previously observed in practical decision-tree design. A termination rule for the algorithm, called the delta-entropy rule, is proposed that improves its robustness in the presence of noise. Simulation results are presented, showing that the tree classifiers derived by the algorithm compare favourably to the single nearest neighbour classifier.
  • Keywords
    decision theory; encoding; information theory; trees (mathematics); Shannon-Fano prefix coding; communication theory; decision tree design; delta-entropy rule; rate-distortion interpretation; single nearest neighbour classifier; termination rule; top-down mutual information algorithm; Algorithm design and analysis; Classification tree analysis; Decision trees; Expert systems; Mutual information; Nearest neighbor searches; Noise robustness; Pattern recognition; Rate-distortion; Space technology
  • fLanguage
    English
  • Journal_Title
    Information Theory, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    0018-9448
  • Type
    jour
  • DOI
    10.1109/18.21221
  • Filename
    21221