• DocumentCode
    3570950
  • Title

    RLS-A reduced labeled samples approach for streaming imbalanced data with concept drift

  • Author

    Arabmakki, Elaheh ; Kantardzic, Mehmed ; Sethi, Tegjyot Singh

  • Author_Institution
    Dept. of Comput. Eng. & Comput. Sci., Univ. of Louisville, Louisville, KY, USA
  • fYear
    2014
  • Firstpage
    779
  • Lastpage
    786
  • Abstract
    In the streaming data milieu, the input data distribution is not static, and models must be updated when concept drift occurs in order to maintain classification performance. Updating a model requires retraining with new incoming labeled samples. However, labeling data is a costly and time-consuming process, so algorithms that do not require every instance in the stream to be labeled are needed. In this paper, a new Reduced Labeled Samples (RLS) framework is proposed, which handles concept drift in imbalanced data streams by selectively labeling only those samples that are most useful in characterizing the drift, thereby generating an updated model with fewer labeled samples. Experimental comparison with state-of-the-art imbalanced stream classification algorithms shows that the RLS framework achieves comparable or better performance while requiring only 18% of the samples to be labeled.
  • Keywords
    data handling; pattern classification; RLS; concept drift; imbalanced data streaming; imbalanced stream classification algorithms; input data distribution; reduced labeled samples approach; streaming data milieu; Buildings; Classification algorithms; Data models; Electronic mail; Labeling; Sea measurements; Support vector machines; Concept drift; Partial labeling; RLS framework; Sampling; Streaming data; Support vector machines;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Information Reuse and Integration (IRI), 2014 IEEE 15th International Conference on
  • Type

    conf

  • DOI
    10.1109/IRI.2014.7051968
  • Filename
    7051968