DocumentCode :
3497186
Title :
Improve the quality of supervised discretization of continuous valued attributes in data mining
Author :
Farid, Dewan Md
Author_Institution :
Dept. of Comput. Sci. & Eng., Jahangirnagar Univ., Dhaka, Bangladesh
fYear :
2011
fDate :
22-24 Dec. 2011
Firstpage :
61
Lastpage :
64
Abstract :
Dealing with continuous-valued attributes is an important data mining problem that affects the accuracy, complexity, and understandability of mining algorithms. This paper presents a new approach to handling continuous attributes that improves the quality of discretization as a preprocessing step for decision tree and naïve Bayesian classifiers. The proposed approach focuses on supervised discretization; however, unsupervised discretization can be applied in the same way. It finds the possible cut points among the values of a continuous attribute that can separate the class distributions, and then selects the best cut point as an interval border using the information gain heuristic and a Bayesian classifier. The proposed approach has been tested against other discretization methods on a number of benchmark problems from the UCI machine learning repository. The experimental results show that the proposed approach improves the quality of discretization of continuous attributes.
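As a rough illustration of the cut-point idea described in the abstract, the following Python sketch lists candidate cut points where the class distribution changes between adjacent attribute values and picks the one with the highest information gain. It is not the paper's algorithm: the Bayesian-classifier step is omitted, and the function names (entropy, candidate_cut_points, information_gain, best_cut_point) and the use of midpoints as cut values are assumptions made for this example.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def candidate_cut_points(values, labels):
    """Midpoints between adjacent distinct values where the classes differ.

    Assumption for this sketch: a cut is skipped only when both neighbouring
    values carry exactly the same single class label.
    """
    classes_by_value = {}
    for v, y in zip(values, labels):
        classes_by_value.setdefault(v, set()).add(y)
    distinct = sorted(classes_by_value)
    cuts = []
    for lo, hi in zip(distinct, distinct[1:]):
        lo_classes, hi_classes = classes_by_value[lo], classes_by_value[hi]
        same_pure_class = len(lo_classes) == 1 and lo_classes == hi_classes
        if not same_pure_class:
            cuts.append((lo + hi) / 2)
    return cuts


def information_gain(values, labels, cut):
    """Entropy reduction obtained by splitting the attribute at `cut`."""
    left = [y for v, y in zip(values, labels) if v <= cut]
    right = [y for v, y in zip(values, labels) if v > cut]
    n = len(labels)
    weighted = (len(left) / n) * entropy(left) + (len(right) / n) * entropy(right)
    return entropy(labels) - weighted


def best_cut_point(values, labels):
    """Candidate cut with the highest information gain, or None if there is none."""
    cuts = candidate_cut_points(values, labels)
    if not cuts:
        return None
    return max(cuts, key=lambda c: information_gain(values, labels, c))
```

Calling best_cut_point with a list of attribute values and a parallel list of class labels returns a single interval border; applying it recursively to each side would give a multi-interval discretization, though a stopping rule would then be needed.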
Keywords :
Bayes methods; data mining; decision trees; pattern classification; UCI machine learning repository; class distribution separation; continuous valued attributes; decision tree; information gain heuristic; interval border; mining algorithm accuracy; mining algorithm complexity; mining algorithm understandability; naïve Bayesian classifier; supervised discretization quality; unsupervised discretization; Bayesian Classifier; Cut Points; Information Gain; Interval Border
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Computer and Information Technology (ICCIT), 2011 14th International Conference on
Conference_Location :
Dhaka
Print_ISBN :
978-1-61284-907-2
Type :
conf
DOI :
10.1109/ICCITechn.2011.6164874
Filename :
6164874