Title :
Active Learning from Data Streams
Author :
Zhu, Xingquan ; Zhang, Peng ; Lin, Xiaodong ; Shi, Yong
Author_Institution :
Florida Atlantic Univ., Boca Raton
Abstract :
In this paper, we address a new research problem: active learning from data streams, where data volumes grow continuously and labeling all data is expensive and impractical. The objective is to label a small portion of the stream data and derive from it a model that predicts newly arrived instances as accurately as possible. To tackle the challenges raised by the dynamic nature of data streams, we propose a classifier-ensembling-based active learning framework that selectively labels instances from data streams to build an accurate classifier. A minimal variance principle is introduced to guide instance labeling from data streams. In addition, a weight updating rule is derived to ensure that the instance labeling process can adapt to dynamically drifting concepts in the data. Experimental results on synthetic and real-world data demonstrate the performance of the proposed framework in comparison with other simple approaches.
Keywords :
data handling; pattern classification; classifier ensembling based active learning framework; data streams; instance labeling; minimal variance principle; Accuracy; Association rules; Computer science; Data engineering; Data mining; Decision making; Labeling; Predictive models; USA Councils; Uncertainty
Conference_Title :
Data Mining, 2007. ICDM 2007. Seventh IEEE International Conference on
Conference_Location :
Omaha, NE
Print_ISBN :
978-0-7695-3018-5
DOI :
10.1109/ICDM.2007.101