Title :
OVFDT with functional tree leaf — Majority class, naive Bayes and adaptive hybrid integrations
Author :
Yang, Hang ; Fong, Simon
Author_Institution :
Fac. of Sci. & Technol., Univ. of Macau, Macau, China
Abstract :
Very Fast Decision Tree (VFDT) is an exemplar of classification techniques in data stream mining, where models are built by incremental learning from continuously arriving data instead of from batches. Many variations and modifications of VFDT have been made since it was first introduced in 2000. Novel contributions have mainly addressed two aspects of VFDT, the tree induction process and the prediction process, with the aim of improving prediction accuracy. The basic concept of inducing a VFDT is to use fresh instances from the data stream to recursively replace leaves with decision nodes. The standard version of VFDT therefore simply classifies a new instance by the distribution counts of the past samples at the leaves. Gama et al. extended VFDT to VFDTNB by installing a naive Bayes classifier at each leaf in the prediction process, so that prior probabilities derived from the attribute values at the leaves can be used to refine the prediction accuracy. They called this general technique Functional Tree Leaf. Recently, a new version of VFDT called Optimized-VFDT (OVFDT) has been proposed by the authors; it achieves relatively good prediction accuracy and compact tree sizes by controlling node splitting and pruning in the tree induction process. These two types of enhanced algorithms, OVFDT and Functional Tree Leaf, are both based on incremental learning and act on complementary processes (induction and prediction, respectively), so they can naturally be integrated for further performance improvement. This paper reports on this integration and the experimental results.
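As an illustration of the three leaf-level prediction strategies named in the title (majority class, naive Bayes, adaptive hybrid), the following is a minimal Python sketch, not the authors' implementation. The `Leaf` class and its method names are hypothetical, attributes are assumed to be nominal, the naive Bayes estimates use Laplace smoothing, and the hybrid rule (use naive Bayes only while it has made fewer mistakes than majority class at this leaf) is an assumption about how such a hybrid could work, since the abstract does not define it.

```python
import math
from collections import defaultdict


class Leaf:
    """Hypothetical tree leaf holding sufficient statistics for nominal attributes."""

    def __init__(self, n_values_per_attribute):
        self.n_values = n_values_per_attribute  # distinct values per attribute
        self.class_counts = defaultdict(int)    # class -> number of samples seen
        self.value_counts = defaultdict(int)    # (attr index, value, class) -> count
        self.mc_errors = 0                      # mistakes made by majority class
        self.nb_errors = 0                      # mistakes made by naive Bayes

    def predict_mc(self):
        """Majority class: the most frequent class observed at this leaf."""
        if not self.class_counts:
            return None
        return max(self.class_counts, key=self.class_counts.get)

    def predict_nb(self, x):
        """Naive Bayes over the leaf's counts, with Laplace smoothing (assumed)."""
        if not self.class_counts:
            return None
        total = sum(self.class_counts.values())
        best, best_score = None, -math.inf
        for c, n_c in self.class_counts.items():
            score = math.log(n_c / total)  # log prior of class c
            for i, v in enumerate(x):
                seen = self.value_counts.get((i, v, c), 0)
                score += math.log((seen + 1) / (n_c + self.n_values[i]))
            if score > best_score:
                best, best_score = c, score
        return best

    def predict_hybrid(self, x):
        """Assumed hybrid rule: pick whichever predictor has erred less so far."""
        if self.nb_errors < self.mc_errors:
            return self.predict_nb(x)
        return self.predict_mc()

    def learn(self, x, y):
        """Test-then-train: score both predictors on (x, y), then update counts."""
        if self.predict_mc() != y:
            self.mc_errors += 1
        if self.predict_nb(x) != y:
            self.nb_errors += 1
        self.class_counts[y] += 1
        for i, v in enumerate(x):
            self.value_counts[(i, v, y)] += 1


# Toy stream: the class is determined by attribute 0, and class "a" is more frequent.
leaf = Leaf(n_values_per_attribute=[2, 2])
stream = [((0, 0), "a")] * 4 + [((1, 0), "b")] * 3
for x, y in stream:
    leaf.learn(x, y)

print(leaf.predict_mc(), leaf.predict_nb((1, 0)), leaf.predict_hybrid((1, 0)))
```

On this toy stream, majority class keeps predicting the more frequent class, while naive Bayes can use the attribute values stored at the leaf; the hybrid follows whichever predictor has the lower error count at that leaf.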
Keywords :
data mining; decision trees; learning (artificial intelligence); pattern classification; classification technique; data stream mining; decision node; functional tree leaf; incremental learning; optimized very fast decision tree; tree induction process; tree prediction process; Accuracy; Data mining; Decision trees; Estimation; Prediction algorithms; Probability; Testing;
Conference_Title :
Data Mining and Intelligent Information Technology Applications (ICMiA), 2011 3rd International Conference on
Conference_Location :
Macao
Print_ISBN :
978-1-4673-0231-9