DocumentCode
3570950
Title
RLS-A reduced labeled samples approach for streaming imbalanced data with concept drift
Author
Arabmakki, Elaheh ; Kantardzic, Mehmed ; Sethi, Tegjyot Singh
Author_Institution
Dept. of Comput. Eng. & Comput. Sci., Univ. of Louisville, Louisville, KY, USA
fYear
2014
Firstpage
779
Lastpage
786
Abstract
In the streaming data milieu, the input data distribution is not static, and models must be updated when concept drift occurs in order to maintain classification performance. Updating a model requires retraining with newly arrived labeled samples. However, labeling data is costly and time-consuming, so algorithms are needed that do not require every instance in the stream to be labeled. In this paper, a new Reduced Labeled Samples (RLS) framework is proposed that handles concept drift in imbalanced data streams by selectively labeling only the samples that are most useful in characterizing the drift, thereby generating an updated model with fewer labeled samples. Experimental comparison with state-of-the-art imbalanced stream classification algorithms shows that the RLS framework achieves comparable or better performance while requiring only 18% of the samples to be labeled.
Keywords
data handling; pattern classification; RLS; concept drift; imbalanced data streaming; imbalanced stream classification algorithms; input data distribution; reduced labeled samples approach; streaming data milieu; Buildings; Classification algorithms; Data models; Electronic mail; Labeling; Sea measurements; Support vector machines; Concept drift; Partial labeling; RLS framework; Sampling; Streaming data; Support vector machines;
fLanguage
English
Publisher
IEEE
Conference_Titel
Information Reuse and Integration (IRI), 2014 IEEE 15th International Conference on
Type
conf
DOI
10.1109/IRI.2014.7051968
Filename
7051968