• DocumentCode
    567672
  • Title

    Purifying training data to improve performance of multi-label classification algorithms

  • Author

    Kanj, Sawsan ; Abdallah, Fahed ; Denoux, Thierry

  • Author_Institution
    HEUDIASYC, Université de Technologie de Compiègne, Compiègne, France
  • fYear
    2012
  • fDate
    9-12 July 2012
  • Firstpage
    1784
  • Lastpage
    1791
  • Abstract
    Multi-label classification assumes that each object in the training set is associated with a set of labels, and the goal is to assign labels to unseen instances. Algorithms based on k-nearest neighbors address the multi-label problem by using the information carried by the neighbors of the observation to be classified. Because of problems such as errors in the input vectors or in their labels, this information may be wrong and may cause the multi-label algorithm to fail. In this paper, we propose a simple algorithm that edits out some training instances, based on a vote among several metrics, in order to purify the existing training sample. This purifying approach is adapted to the recently proposed evidential k-nearest neighbors method for multi-label classification. Comparative experimental results on various data sets demonstrate the usefulness and effectiveness of our approach.
  • Keywords
    data analysis; learning (artificial intelligence); pattern classification; evidential k-nearest neighbors; input vectors; label assignment; machine learning; multilabel classification algorithm; training data purification; training instance; Classification algorithms; Loss measurement; Noise measurement; Training; Training data; Vectors
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Information Fusion (FUSION), 2012 15th International Conference on
  • Conference_Location
    Singapore
  • Print_ISBN
    978-1-4673-0417-7
  • Electronic_ISBN
    978-0-9824438-4-2
  • Type

    conf

  • Filename
    6290519