DocumentCode :
1961716
Title :
CMP: a fast decision tree classifier using multivariate predictions
Author :
Wang, Haixun ; Zaniolo, Carlo
Author_Institution :
Dept. of Comput. Sci., California Univ., Los Angeles, CA, USA
fYear :
2000
fDate :
2000
Firstpage :
449
Lastpage :
460
Abstract :
Most decision tree classifiers are designed to keep class histograms for single attributes, and to select a particular attribute for the next split using those histograms. We propose a technique where, by keeping histograms on attribute pairs, we achieve: a significant speed-up over traditional classifiers based on single attribute splitting; and the ability to build classifiers that use linear combinations of values from non-categorical attribute pairs as the split criterion. Indeed, by keeping two-dimensional histograms, CMP can often predict the best successive split, in addition to computing the current one; therefore, CMP is normally able to grow more than one level of a decision tree for each data scan. CMP's performance improvements are also due to techniques whereby non-categorical attributes are discretized without loss in classification accuracy; in fact, we introduce simple techniques whereby classification errors caused by discretization at one step can be corrected in the following step. In summary, CMP represents a unified algorithm that extends the functionality of existing classifiers and improves their performance.
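The idea of reusing one two-dimensional class histogram for both the current split and the following one can be illustrated with a minimal sketch. This is not the authors' implementation: the Gini impurity criterion, the pre-discretized bin indices, the toy data, and every function name here (`build_pair_histogram`, `best_split`, `merge`, `gini`) are assumptions made purely for illustration.

```python
from collections import Counter

def gini(counts):
    """Gini impurity of a mapping from class label to count."""
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return 1.0 - sum((c / total) ** 2 for c in counts.values())

def merge(cells):
    """Sum several class-count Counters into one."""
    total = Counter()
    for cell in cells:
        total.update(cell)
    return total

def build_pair_histogram(rows, x_bins, y_bins):
    """Single data scan: hist[i][j] counts class labels in bin pair (i, j)."""
    hist = [[Counter() for _ in range(y_bins)] for _ in range(x_bins)]
    for x_bin, y_bin, label in rows:
        hist[x_bin][y_bin][label] += 1
    return hist

def best_split(slices):
    """Best threshold t over per-bin Counters: bins < t go left, the rest right.
    Returns (weighted Gini impurity, t)."""
    n = sum(sum(s.values()) for s in slices)
    best = (float("inf"), None)
    for t in range(1, len(slices)):
        left, right = merge(slices[:t]), merge(slices[t:])
        n_left, n_right = sum(left.values()), sum(right.values())
        if n_left == 0 or n_right == 0:
            continue
        score = (n_left * gini(left) + n_right * gini(right)) / n
        if score < best[0]:
            best = (score, t)
    return best

# Hypothetical toy data: (x_bin, y_bin, class_label), already discretized.
rows = [
    (0, 0, "a"), (0, 1, "a"), (1, 0, "a"), (1, 1, "a"),
    (2, 0, "b"), (2, 1, "c"), (3, 0, "b"), (3, 1, "c"),
]
X_BINS, Y_BINS = 4, 2
hist = build_pair_histogram(rows, X_BINS, Y_BINS)

# Current split: the marginal histogram over X is recovered from the 2-D one.
x_score, x_t = best_split([merge(hist[i]) for i in range(X_BINS)])

# Successive split: restrict the same histogram to the right partition
# (x_bin >= x_t) and evaluate a split on Y without rescanning the data.
right_y = [merge(hist[i][j] for i in range(x_t, X_BINS)) for j in range(Y_BINS)]
y_score, y_t = best_split(right_y)

print(x_t, x_score, y_t, y_score)
```

In this toy case the second-level split on Y is obtained from the histogram built during the first scan, which is the effect the abstract describes as growing more than one tree level per data scan. The sketch only covers axis-parallel splits; the linear-combination splits mentioned in the abstract are not shown.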
Keywords :
classification; data mining; database theory; decision trees; software performance evaluation; very large databases; CMP; attribute pairs; class histograms; data mining; fast decision tree classifier; multivariate predictions; performance improvements; single attribute splitting; Classification tree analysis; Data mining; Databases; Decision trees; Ear; Genetics; Histograms; Machine learning; Read only memory; Statistics;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Data Engineering, 2000. Proceedings. 16th International Conference on
Conference_Location :
San Diego, CA
ISSN :
1063-6382
Print_ISBN :
0-7695-0506-6
Type :
conf
DOI :
10.1109/ICDE.2000.839444
Filename :
839444