DocumentCode :
3125831
Title :
Enabling Fast Lazy Learning for Data Streams
Author :
Zhang, Peng ; Gao, Byron J. ; Zhu, Xingquan ; Guo, Li
Author_Institution :
Inst. of Comput. Technol., Beijing, China
fYear :
2011
fDate :
11-14 Dec. 2011
Firstpage :
932
Lastpage :
941
Abstract :
Lazy learning, such as k-nearest neighbor learning, has been widely applied in many applications. Known for capturing data locality well, lazy learning can be advantageous in highly dynamic and complex learning environments such as data streams. Yet its high memory consumption and low prediction efficiency have made it less favorable for stream-oriented applications. Specifically, traditional lazy learning stores all the training data and defers the inductive process until a query appears, whereas in stream applications, data records flow continuously in large volumes and class labels must be predicted in a timely manner. In this paper, we provide a systematic solution that overcomes the memory and efficiency limitations and enables fast lazy learning for concept-drifting data streams. In particular, we propose a novel Lazy-tree (L-tree for short) indexing structure that dynamically maintains compact high-level summaries of historical stream records. L-trees are height-balanced, M-Tree-like [5] structures that can greatly reduce memory consumption and achieve sub-linear time complexity for prediction. Moreover, L-trees continuously absorb new stream records and discard outdated ones, so they can naturally adapt to the dynamically changing concepts in data streams for accurate prediction. Extensive experiments on real-world and synthetic data streams demonstrate the performance of our approach.
Keywords :
computational complexity; data mining; learning (artificial intelligence); pattern classification; tree data structures; Ltree; M-tree; complex learning environments; data locality; data mining community; data stream classification; dynamic learning environments; fast lazy learning; k-nearest neighbor learning; lazy-tree; memory consumption reduction; sublinear time complexity; Complexity theory; Indexing; Learning systems; Memory management; Routing; Training data; Vectors; Spatial indexing; concept drifting; data stream classification; data stream mining; lazy learning;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Data Mining (ICDM), 2011 IEEE 11th International Conference on
Conference_Location :
Vancouver, BC
ISSN :
1550-4786
Print_ISBN :
978-1-4577-2075-8
Type :
conf
DOI :
10.1109/ICDM.2011.63
Filename :
6137298