Title :
Efficient determination of dynamic split points in a decision tree
Author :
Chickering, David Maxwell ; Meek, Christopher ; Rounthwaite, Robert
Author_Institution :
Microsoft Corp., Redmond, WA, USA
Abstract :
We consider the problem of choosing split points for continuous predictor variables in a decision tree. Previous approaches to this problem typically either: (1) discretize the continuous predictor values prior to learning, or (2) apply a dynamic method that considers all possible split points for each potential split. We describe a number of alternative approaches that generate a small number of candidate split points dynamically with little overhead. We argue that these approaches are preferable to pre-discretization, and provide experimental evidence that they yield probabilistic decision trees with the same prediction accuracy as the traditional dynamic approach. Furthermore, because the time to grow a decision tree is proportional to the number of split points evaluated, our approach is significantly faster than the traditional dynamic approach.
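The contrast drawn in the abstract, between evaluating every possible split point and evaluating only a small dynamically generated set, can be illustrated with a minimal sketch. The function names, the parameter `k`, and the use of evenly spaced empirical quantiles are assumptions made for illustration only; the abstract does not say how the candidates are generated, so this should not be read as the authors' method.

```python
from typing import List, Sequence


def all_split_points(values: Sequence[float]) -> List[float]:
    """Traditional dynamic approach: one candidate (the midpoint) between
    each pair of adjacent distinct values seen at the current node."""
    distinct = sorted(set(values))
    return [(a + b) / 2.0 for a, b in zip(distinct, distinct[1:])]


def quantile_split_points(values: Sequence[float], k: int) -> List[float]:
    """Hypothetical alternative: at most k candidates placed at evenly
    spaced empirical quantiles of the values seen at the current node."""
    ordered = sorted(values)
    n = len(ordered)
    if n < 2 or k < 1:
        return []
    candidates: List[float] = []
    for i in range(1, k + 1):
        # Index of the i-th of k evenly spaced cut positions, clamped so
        # that both neighbours exist.
        idx = min(max((i * n) // (k + 1), 1), n - 1)
        lo, hi = ordered[idx - 1], ordered[idx]
        if lo < hi:  # a split between equal values separates nothing
            point = (lo + hi) / 2.0
            if not candidates or point > candidates[-1]:
                candidates.append(point)
    return candidates


# Values of one continuous predictor at a node (made-up data).
values = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
exhaustive = all_split_points(values)
reduced = quantile_split_points(values, k=3)
```

With ten distinct values the exhaustive list holds nine candidates, while the reduced list holds at most `k`. Because the candidates are recomputed from the data that reaches each node, the sketch stays dynamic rather than pre-discretized, and the number of split scores to evaluate per predictor is bounded by `k`.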
Keywords :
data analysis; decision trees; learning (artificial intelligence); probability; candidate split points; continuous predictor value discretization; continuous predictor variables; decision tree; dynamic approach; dynamic method; dynamic split point determination; potential split; probabilistic decision trees; split points; traditional dynamic approach; Bayesian methods; Classification tree analysis; Context modeling; Decision trees; Degradation; Heuristic algorithms; Learning systems; Prediction algorithms; Probability distribution; Testing
Conference_Title :
Data Mining, 2001. ICDM 2001, Proceedings IEEE International Conference on
Conference_Location :
San Jose, CA
Print_ISBN :
0-7695-1119-8
DOI :
10.1109/ICDM.2001.989505