DocumentCode
1209231
Title
Classifiability-based omnivariate decision trees
Author
Li, Yuanhong; Dong, Ming; Kothari, Ravi
Author_Institution
Dept. of Comput. Sci., Wayne State Univ., Detroit, MI, USA
Volume
16
Issue
6
fYear
2005
Firstpage
1547
Lastpage
1560
Abstract
Top-down induction of decision trees is a simple and powerful method of pattern classification. In a decision tree, each node partitions the available patterns into two or more sets. New nodes are created to handle each of the resulting partitions and the process continues. A node is considered terminal if it satisfies some stopping criterion (for example, purity, i.e., all patterns at the node belong to a single class). Decision trees may be univariate, linear multivariate, or nonlinear multivariate, depending on whether a single attribute, a linear function of all the attributes, or a nonlinear function of all the attributes is used for the partitioning at each node of the decision tree. Though nonlinear multivariate decision trees are the most powerful, they are also the most susceptible to overfitting. In this paper, we propose to perform model selection at each decision node to build omnivariate decision trees. The model selection is done using a novel classifiability measure that captures the possible sources of misclassification with relative ease and accurately reflects the complexity of the subproblem at each node. The proposed approach is fast and does not incur the high computational burden of typical model selection algorithms. Empirical results over 26 data sets indicate that our approach is faster and achieves better classification accuracy than statistical model selection algorithms.
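For illustration only, the sketch below shows the node-level idea the abstract describes: at each decision node a univariate, linear multivariate, or nonlinear multivariate split model is selected before partitioning the patterns. The paper's classifiability measure is not reproduced here; cross-validated accuracy, the particular candidate models, and all thresholds are assumptions made for the example, not the authors' method.

```python
# Minimal sketch (not the authors' implementation) of omnivariate decision-tree
# induction: at each node, three candidate split models are compared -- univariate,
# linear multivariate, and nonlinear multivariate -- and one is selected.
# The paper selects with a classifiability measure; 3-fold cross-validated accuracy
# is used here as a simple stand-in.
import numpy as np
from sklearn.base import clone
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC


class OmnivariateNode:
    def __init__(self, model=None, label=None):
        self.model = model      # split model chosen at this node (None if terminal)
        self.label = label      # majority class label if the node is terminal
        self.children = {}      # one child per branch produced by the split


def build_tree(X, y, depth=0, max_depth=5, min_samples=20):
    classes, counts = np.unique(y, return_counts=True)
    # Stopping criteria: purity, too few patterns, maximum depth,
    # or too few patterns per class to cross-validate the candidate models.
    if (len(classes) == 1 or len(y) < min_samples
            or depth >= max_depth or counts.min() < 3):
        return OmnivariateNode(label=classes[np.argmax(counts)])

    # Candidate split models: univariate stump, linear multivariate,
    # and nonlinear multivariate (an RBF SVM is an illustrative choice).
    candidates = {
        "univariate": DecisionTreeClassifier(max_depth=1),
        "linear": LogisticRegression(max_iter=1000),
        "nonlinear": SVC(kernel="rbf"),
    }

    # Per-node model selection (CV accuracy stands in for classifiability).
    scores = {name: cross_val_score(clone(m), X, y, cv=3).mean()
              for name, m in candidates.items()}
    best = max(scores, key=scores.get)
    node = OmnivariateNode(model=candidates[best].fit(X, y))

    # Partition the patterns by the chosen model's predictions and recurse.
    branches = node.model.predict(X)
    if len(np.unique(branches)) == 1:   # split separated nothing: make node terminal
        return OmnivariateNode(label=classes[np.argmax(counts)])
    for branch in np.unique(branches):
        mask = branches == branch
        node.children[branch] = build_tree(X[mask], y[mask], depth + 1,
                                           max_depth, min_samples)
    return node


def predict_one(tree, x):
    # Route a single pattern down the tree until a terminal node is reached.
    node = tree
    while node.label is None:
        branch = node.model.predict(x.reshape(1, -1))[0]
        if branch not in node.children:   # branch unseen during training
            return branch
        node = node.children[branch]
    return node.label
```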
Keywords
computational complexity; decision diagrams; decision trees; model-based reasoning; pattern classification; set theory; Bayes error; classifiability-based omnivariate decision trees; classification accuracy; data complexity; data density; decision boundary; decision node; linear function; linear multivariate; misclassification; model selection algorithms; node partitioning; node partitions; nonlinear function; nonlinear multivariate decision trees; pattern classification; statistical model selection algorithms; univariate; Classification tree analysis; Computer science; Decision trees; Laboratories; Machine vision; Neural networks; Partitioning algorithms; Pattern classification; Pattern recognition; Power generation; Bayes error; data complexity; data density; decision boundary; omnivariate decision trees; Algorithms; Artificial Intelligence; Computer Simulation; Decision Making, Computer-Assisted; Decision Support Techniques; Models, Theoretical; Pattern Recognition, Automated
fLanguage
English
Journal_Title
IEEE Transactions on Neural Networks
Publisher
IEEE
ISSN
1045-9227
Type
jour
DOI
10.1109/TNN.2005.852864
Filename
1528531