DocumentCode
3292169
Title
An Efficient Decision Tree Classification Method Based on Extended Hash Table for Data Streams Mining
Author
Ouyang, Zhenzheng ; Wu, Quanyuan ; Wang, Tao
Author_Institution
Zhenzheng Ouyang Sci. Sch., Nat. Univ. of Defense Technol., Changsha
Volume
5
fYear
2008
fDate
18-20 Oct. 2008
Firstpage
313
Lastpage
317
Abstract
This paper focuses on handling continuous attributes when mining data streams with concept drift. A data stream is an incremental, online, real-time model. Domingos and Hulten presented a one-pass algorithm; their system, VFDT, uses the Hoeffding inequality to achieve a probabilistic bound on the accuracy of the constructed tree. CVFDT, the extended version of VFDT, handles concept drift efficiently. In this paper, we revisit this problem and implement a system, HashCVFDT, on top of CVFDT. It is as fast as a hash table when inserting, seeking, or deleting attribute values, and it can also sort the attribute values.
Keywords
data mining; decision trees; file organisation; pattern classification; Hoeffding inequality; continuous attributes handling; data streams mining; decision tree classification; extended hash table; Classification tree analysis; Data mining; Data processing; Decision trees; Fuzzy systems; Memory management; Sorting; Statistics; Testing; Time factors; Continuous Attribute; Data Streams; Extended Hash Table
fLanguage
English
Publisher
IEEE
Conference_Titel
Fuzzy Systems and Knowledge Discovery, 2008. FSKD '08. Fifth International Conference on
Conference_Location
Jinan, Shandong
Print_ISBN
978-0-7695-3305-6
Type
conf
DOI
10.1109/FSKD.2008.481
Filename
4666543