DocumentCode :
3292169
Title :
An Efficient Decision Tree Classification Method Based on Extended Hash Table for Data Streams Mining
Author :
Ouyang, Zhenzheng ; Wu, Quanyuan ; Wang, Tao
Author_Institution :
Sci. Sch., Nat. Univ. of Defense Technol., Changsha
Volume :
5
fYear :
2008
fDate :
18-20 Oct. 2008
Firstpage :
313
Lastpage :
317
Abstract :
This paper focuses on handling continuous attributes when mining data streams with concept drift. A data stream is an incremental, online, real-time model. Domingos and Hulten presented a one-pass algorithm; their system, VFDT, uses the Hoeffding inequality to achieve a probabilistic bound on the accuracy of the constructed tree. CVFDT, an extended version of VFDT, handles concept drift efficiently. In this paper, we revisit this problem and implement a system, HashCVFDT, on top of CVFDT. It is as fast as a hash table when inserting, seeking, or deleting attribute values, and it can also sort the attribute values.
Keywords :
data mining; decision trees; file organisation; pattern classification; Hoeffding inequality; continuous attributes handling; data streams mining; decision tree classification; extended hash table; Classification tree analysis; Data mining; Data processing; Decision trees; Fuzzy systems; Memory management; Sorting; Statistics; Testing; Time factors; Continuous Attribute; Data Streams; Extended Hash Table;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Fuzzy Systems and Knowledge Discovery, 2008. FSKD '08. Fifth International Conference on
Conference_Location :
Jinan, Shandong
Print_ISBN :
978-0-7695-3305-6
Type :
conf
DOI :
10.1109/FSKD.2008.481
Filename :
4666543