DocumentCode :
1893042
Title :
Use of Hoeffding trees in concept based data stream mining
Author :
Hoeglinger, Stefan ; Pears, Russel
Author_Institution :
Sch. of Comput. & Math. Sci., Auckland Univ. of Technol., Auckland
fYear :
2007
fDate :
4-6 Dec. 2007
Firstpage :
57
Lastpage :
62
Abstract :
Recent research in data mining has focussed on developing new algorithms for mining high-speed data streams. Most real-world data streams have in common that the underlying data generation mechanism changes over time, introducing so-called concept drift into the data. Many current algorithms incorporate a time-based window to be able to cope with drift in order to keep their model up-to-date with the data stream. A major problem with this approach is the potential loss of valuable information as data slides out of the time window. This is particularly a concern in those environments where patterns recur. In this paper, we present a concept-based window approach, which is integrated with a high-speed decision tree learner. Our approach uses the content of the data stream itself in order to decide which information is to be erased. Several methodologies, all based around minimising the overall information loss when pruning the decision tree, are discussed.
Keywords :
data mining; decision trees; learning (artificial intelligence); Hoeffding tree; concept based data stream mining; concept-based window approach; data generation mechanism; high-speed decision tree learner; time-based window approach; Adaptive algorithm; Aging; Credit cards; Data mining; Decision trees; Learning systems; Monitoring; Probability distribution; Runtime environment; data stream mining; decision trees; Hoeffding bound; information gain;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Information and Automation for Sustainability, 2007. ICIAFS 2007. Third International Conference on
Conference_Location :
Melbourne, VIC
Print_ISBN :
978-1-4244-1899-2
Electronic_ISBN :
978-1-4244-1900-5
Type :
conf
DOI :
10.1109/ICIAFS.2007.4544780
Filename :
4544780
Link To Document :