Title :
A Top-Down and Greedy Method for Discretization of Continuous Attributes
Author :
Lee, Chien-I ; Tsai, Cheng-Jung ; Yang, Ya-Ru ; Yang, Wei-Pang
Author_Institution :
Nat. Univ. of Tainan, Tainan
Abstract :
Experiments show that the CAIM discretization algorithm is superior to all other top-down discretization algorithms. However, CAIM does not take the data distribution into account, and its discretization formula gives heavy weight to the number of generated intervals. Because of these two disadvantages, CAIM may generate irrational discretization results in some cases, which in turn decreases the predictive accuracy of a classifier. In this paper we propose the class-attribute contingency coefficient (CACC) discretization algorithm. The experimental results show that, compared with CAIM, our method generates a better discretization scheme and thereby improves classification accuracy. With regard to the number of generated rules and the execution time of a classifier, CACC and CAIM achieve comparable results.
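As a rough illustration of the two kinds of criteria the abstract contrasts, the sketch below computes the commonly cited CAIM criterion and a plain Pearson contingency coefficient from a quanta matrix (class counts per interval). The function names and the example matrices are hypothetical, and the contingency coefficient shown is the standard statistical one; the abstract does not state the exact CACC formula, so this should not be read as the authors' criterion.

```python
import math


def caim_criterion(quanta):
    """CAIM value for a quanta matrix (rows = classes, columns = intervals).

    Uses the commonly cited form: the mean over intervals of
    (largest class count in the interval)^2 / (total count in the interval).
    The division by the number of intervals is what penalizes extra intervals.
    """
    n_intervals = len(quanta[0])
    total = 0.0
    for column in zip(*quanta):
        interval_count = sum(column)
        if interval_count:
            total += max(column) ** 2 / interval_count
    return total / n_intervals


def contingency_coefficient(quanta):
    """Pearson contingency coefficient C = sqrt(chi2 / (chi2 + M)).

    Assumption: this is the textbook coefficient, used here only to show
    how a criterion can depend on the full class/interval distribution.
    """
    row_sums = [sum(row) for row in quanta]
    col_sums = [sum(col) for col in zip(*quanta)]
    m = sum(row_sums)
    s = sum(
        q * q / (row_sums[i] * col_sums[r])
        for i, row in enumerate(quanta)
        for r, q in enumerate(row)
        if q
    )
    chi2 = max(m * (s - 1.0), 0.0)  # chi2 = M * (sum q^2 / (M_i+ * M_+r) - 1)
    return math.sqrt(chi2 / (chi2 + m))


# Hypothetical quanta matrices: two classes, two intervals each.
separated = [[10, 0], [0, 10]]  # each interval holds a single class
mixed = [[5, 5], [5, 5]]        # classes evenly spread over both intervals

print(caim_criterion(separated), contingency_coefficient(separated))
print(caim_criterion(mixed), contingency_coefficient(mixed))
```

Both measures score the perfectly separated scheme higher than the mixed one. The CAIM value uses only the dominant class count in each interval, whereas the contingency coefficient uses every cell of the matrix.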
Keywords :
data handling; data mining; optimisation; pattern classification; CAIM discretization; classifier; continuous attributes; data distribution; top-down discretization; Accuracy; Classification algorithms; Computational complexity; Data mining; Design automation; Entropy; Frequency; Merging; Prediction algorithms; Technology management;
Conference_Title :
Fuzzy Systems and Knowledge Discovery, 2007. FSKD 2007. Fourth International Conference on
Conference_Location :
Haikou
Print_ISBN :
978-0-7695-2874-8
DOI :
10.1109/FSKD.2007.129