DocumentCode :
3570950
Title :
RLS-A reduced labeled samples approach for streaming imbalanced data with concept drift
Author :
Arabmakki, Elaheh ; Kantardzic, Mehmed ; Sethi, Tegjyot Singh
Author_Institution :
Dept. of Comput. Eng. & Comput. Sci., Univ. of Louisville, Louisville, KY, USA
fYear :
2014
Firstpage :
779
Lastpage :
786
Abstract :
In the streaming data milieu, the input data distribution is not static, and models must be updated when concept drift occurs in order to maintain classification performance. Updating a model requires retraining on newly arriving labeled samples. However, labeling data is costly and time-consuming, so algorithms are needed that do not require every instance in the stream to be labeled. In this paper, a new Reduced Labeled Samples (RLS) framework is proposed that handles concept drift in imbalanced data streams by selectively labeling only those samples that are most useful in characterizing the drift, thereby generating an updated model with fewer labeled samples. Experimental comparison with state-of-the-art imbalanced stream classification algorithms shows that the RLS framework achieves comparable or better performance while requiring only 18% of the samples to be labeled.
Keywords :
data handling; pattern classification; RLS; concept drift; imbalanced data streaming; imbalanced stream classification algorithms; input data distribution; reduced labeled samples approach; streaming data milieu; Buildings; Classification algorithms; Data models; Electronic mail; Labeling; Sea measurements; Support vector machines; Concept drift; Partial labeling; RLS framework; Sampling; Streaming data; Support vector machines;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Information Reuse and Integration (IRI), 2014 IEEE 15th International Conference on
Type :
conf
DOI :
10.1109/IRI.2014.7051968
Filename :
7051968