Title :
Study on Handling Range Inputs Methods on C4.5 Algorithm
Author :
Jing-ti, Han ; Yu-jia, Gu
Author_Institution :
Sch. of Inf. Manage. & Eng., Shanghai Univ. of Finance & Econ. (SHUFE), Shanghai, China
Abstract :
How to build a decision tree successfully remains an active topic in data mining, and many scholars have contributed improvements to decision tree building algorithms. However, a dataset sometimes has range input attributes, for which the existing decision tree building methods, namely mean substitute, min-max substitute and mean-extent substitute, may not be suitable. This paper combines C4.5 with fuzzy mathematics to put forward a method structure for handling range inputs. The new method makes important improvements to the calculation of membership grade and entropy. A validation of the method's usefulness is then presented. The method is thought to be applicable to investigation methodology, mainly for continuous data inputs with inexact data consisting of maximums and minimums.
Keywords :
data mining; decision trees; entropy; fuzzy set theory; C4.5 algorithm; decision tree building algorithms; entropy calculation method; forward method structure; fuzzy mathematics; handling range inputs methods; mean extent substitute; min-max substitute; range input attributes; Application software; Buildings; Computer applications; Finance; Information management; Mathematics; Minimax techniques
Conference_Title :
Computer Science-Technology and Applications, 2009. IFCSTA '09. International Forum on
Conference_Location :
Chongqing
Print_ISBN :
978-0-7695-3930-0
Electronic_ISBN :
978-1-4244-5423-5
DOI :
10.1109/IFCSTA.2009.18