Title :
Top-down induction of model trees with regression and splitting nodes
Author :
Malerba, Donato ; Esposito, Floriana ; Ceci, Michelangelo ; Appice, Annalisa
Author_Institution :
Dipt. di Inf., Univ. degli Studi, Bari, Italy
fDate :
5/1/2004
Abstract :
Model trees are an extension of regression trees that associate leaves with multiple regression models. In this paper, a method for the data-driven construction of model trees is presented, namely, the stepwise model tree induction (SMOTI) method. Its main characteristic is the induction of trees with two types of nodes: regression nodes, which perform only straight-line regression, and splitting nodes, which partition the feature space. The multiple linear model associated with each leaf is then built stepwise by combining straight-line regressions reported along the path from the root to the leaf. In this way, internal regression nodes contribute to the definition of multiple models and have a "global" effect, while straight-line regressions at leaves have only "local" effects. Experimental results on artificially generated data sets show that SMOTI outperforms two model tree induction systems, M5' and RETIS, in accuracy. Results on benchmark data sets used for studies on both regression and model trees show that SMOTI performs better than RETIS in accuracy, while it is not possible to draw statistically significant conclusions on the comparison with M5'. Model trees induced by SMOTI are generally simple and easily interpretable and their analysis often reveals interesting patterns.
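The abstract does not spell out how straight-line regressions are combined into a multiple linear model, so the following is only a minimal sketch of one standard way to do it, not necessarily the exact SMOTI procedure. It assumes that, after regressing y on a first variable x1, the linear effect of x1 is removed from both y and the next variable x2 before the second straight-line regression is fitted on the residuals. The function names (fit_line, stepwise_two_variable_model) and the toy data are hypothetical and chosen for illustration.

```python
def fit_line(x, y):
    """Least-squares straight line y = a + b*x; returns (a, b)."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    return my - b * mx, b


def stepwise_two_variable_model(x1, x2, y):
    """Build y = c0 + c1*x1 + c2*x2 from straight-line regressions only.

    Assumption (not stated in the abstract): residuals of both y and x2
    with respect to x1 are used for the second straight-line regression.
    """
    a_y, b_y = fit_line(x1, y)    # first straight-line regression: y on x1
    a_x, b_x = fit_line(x1, x2)   # linear effect of x1 on x2
    y_res = [yi - (a_y + b_y * xi) for xi, yi in zip(x1, y)]
    x2_res = [zi - (a_x + b_x * xi) for xi, zi in zip(x1, x2)]
    a_r, b_r = fit_line(x2_res, y_res)  # second regression on residuals
    # Substitute back: y = a_y + b_y*x1 + a_r + b_r*(x2 - a_x - b_x*x1)
    c0 = a_y + a_r - b_r * a_x
    c1 = b_y - b_r * b_x
    c2 = b_r
    return c0, c1, c2


# Toy data generated exactly from y = 1 + 2*x1 - 3*x2 (hypothetical values).
x1 = [0.0, 1.0, 2.0, 3.0, 4.0]
x2 = [1.0, 0.0, 2.0, 1.0, 3.0]
y = [1.0 + 2.0 * a - 3.0 * b for a, b in zip(x1, x2)]
print(stepwise_two_variable_model(x1, x2, y))
```

On noise-free data such as this, the combined coefficients should agree, up to floating-point rounding, with those used to generate y, which illustrates why a sequence of straight-line regressions can define a multiple linear model.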
Keywords :
learning by example; regression analysis; trees (mathematics); benchmark data sets; data driven construction; global effect; internal regression nodes; multiple linear model; multiple regression models; regression trees; splitting nodes; stepwise model tree induction method; straight line regression; top down induction; Induction generators; Linear regression; Machine learning; Neural networks; Pattern analysis; Piecewise linear approximation; Piecewise linear techniques; Regression tree analysis; Statistics; Tree data structures; Algorithms; Artificial Intelligence; Cluster Analysis; Computer Simulation; Decision Support Techniques; Information Storage and Retrieval; Numerical Analysis, Computer-Assisted; Pattern Recognition, Automated; Reproducibility of Results; Sensitivity and Specificity
Journal_Title :
Pattern Analysis and Machine Intelligence, IEEE Transactions on
DOI :
10.1109/TPAMI.2004.1273937