DocumentCode :
671566
Title :
Active learning in nonstationary environments
Author :
Capo, Robert ; Dyer, Karl B. ; Polikar, Robi
Author_Institution :
Electr. & Comput. Eng. Dept., Rowan Univ., Glassboro, NJ, USA
fYear :
2013
fDate :
4-9 Aug. 2013
Firstpage :
1
Lastpage :
8
Abstract :
An increasing number of practical applications that involve streaming nonstationary data has led to a recent surge in algorithms designed to learn from such data. One challenging version of this problem that has not received as much attention, however, is learning from streaming nonstationary data when only a small initial set of data is labeled, with unlabeled data being available thereafter. We have recently introduced the COMPOSE algorithm for learning in such scenarios, which we refer to as initially labeled nonstationary streaming data. COMPOSE works remarkably well; however, it requires limited (gradual) drift and cannot address special cases such as the introduction of a new class or significant overlap of existing classes, as such scenarios cannot be learned without additional labeled data. However, scenarios that provide occasional or periodic limited labeled data are not uncommon, and in such settings many of COMPOSE's restrictions can be lifted. In this contribution, we describe a new version of COMPOSE as a proof-of-concept algorithm that can identify the instances whose labels, if available, would be most beneficial, and then combine those instances with unlabeled data to actively learn from streaming nonstationary data, even when the distribution of the data experiences abrupt changes. On two carefully designed experiments that include abrupt changes as well as the addition of new classes, we show that COMPOSE.AL significantly outperforms the original COMPOSE, while closely tracking the optimal Bayes classifier performance.
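The following is a minimal sketch of the general idea the abstract describes (querying the most informative labels on a drifting stream and combining them with unlabeled data), not the paper's COMPOSE.AL algorithm. It assumes margin-based uncertainty sampling, self-training on the unqueried instances, and scikit-learn's LogisticRegression as a stand-in classifier; all names and the toy drifting stream are illustrative.

import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

def query_most_uncertain(clf, X_unlabeled, budget):
    """Return indices of the `budget` instances the classifier is least sure about."""
    proba = np.sort(clf.predict_proba(X_unlabeled), axis=1)
    margin = proba[:, -1] - proba[:, -2]        # small margin = high uncertainty
    return np.argsort(margin)[:budget]

def process_batch(clf, X_batch, oracle_labels, budget=5):
    """One streaming step: query a few true labels, self-label the rest, refit."""
    queried = query_most_uncertain(clf, X_batch, budget)
    y_batch = clf.predict(X_batch)              # provisional self-labels for unlabeled data
    y_batch[queried] = oracle_labels[queried]   # true labels only for the queried instances
    clf.fit(X_batch, y_batch)                   # refit on the current batch only (drift-friendly)
    return clf

# Toy drifting stream: the class means shift abruptly halfway through.
X0 = np.vstack([rng.normal(-2, 1, (50, 2)), rng.normal(2, 1, (50, 2))])
y0 = np.array([0] * 50 + [1] * 50)
clf = LogisticRegression().fit(X0, y0)          # initially labeled data

for t in range(10):
    shift = 0.0 if t < 5 else 3.0               # abrupt change at t = 5
    X_t = np.vstack([rng.normal(-2 + shift, 1, (50, 2)),
                     rng.normal(2 + shift, 1, (50, 2))])
    y_t = np.array([0] * 50 + [1] * 50)         # oracle labels, revealed only when queried
    clf = process_batch(clf, X_t, y_t)

Refitting only on the current batch is one common way to track drift; the actual paper instead builds on COMPOSE's compacted object sample extraction, which this sketch does not attempt to reproduce.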
Keywords :
Bayes methods; data analysis; learning (artificial intelligence); pattern classification; COMPOSE algorithm; active learning; compacted object sample extraction; data distribution; nonstationary data streaming; nonstationary environments; nonstationary streaming data; occasional limited labeled data; optimal Bayes classifier performance; periodic limited labeled data; proof-of-concept algorithm; unlabeled data; Algorithm design and analysis; Classification algorithms; Complexity theory; Data mining; Measurement; Training data; Uncertainty; COMPOSE; active learning; concept drift; non-stationary environment; streaming data;
fLanguage :
English
Publisher :
ieee
Conference_Titel :
Neural Networks (IJCNN), The 2013 International Joint Conference on
Conference_Location :
Dallas, TX
ISSN :
2161-4393
Print_ISBN :
978-1-4673-6128-6
Type :
conf
DOI :
10.1109/IJCNN.2013.6706906
Filename :
6706906