DocumentCode :
567672
Title :
Purifying training data to improve performance of multi-label classification algorithms
Author :
Kanj, Sawsan ; Abdallah, Fahed ; Denoeux, Thierry
Author_Institution :
HEUDIASYC, Univ. de Technol. de Compiegne, Compiègne, France
fYear :
2012
fDate :
9-12 July 2012
Firstpage :
1784
Lastpage :
1791
Abstract :
Multi-label classification assumes that each object in the training set is associated with a set of labels, and the goal is to assign labels to unseen instances. Algorithms based on k-nearest neighbors address the multi-label problem by using the information carried by the neighbors of the observation to be classified. Because of problems such as errors in the input vectors or in their labels, this information may be wrong and may cause the multi-label algorithm to fail. In this paper, we propose a simple algorithm that edits out some training instances, selected by a vote over several metrics, in order to purify the existing training sample. This purifying approach is adapted to the recently proposed evidential k-nearest neighbors method for multi-label classification. Comparative experimental results on various data sets demonstrate the usefulness and effectiveness of our approach.
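The editing idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' algorithm: the abstract does not name the metrics or the voting rule, so this sketch substitutes a single metric (Hamming loss) and a plain majority-vote k-nearest-neighbors predictor instead of the evidential one. The function names `knn_label_vote`, `hamming_loss`, and `purify`, and the parameters `k` and `max_loss`, are all hypothetical.

```python
from math import dist


def knn_label_vote(X, Y, i, k):
    """Predict the label vector of training instance i from its k nearest
    neighbours, excluding i itself (leave-one-out)."""
    others = [j for j in range(len(X)) if j != i]
    others.sort(key=lambda j: dist(X[i], X[j]))
    neighbours = others[:k]
    n_labels = len(Y[0])
    # Assumption: a label is predicted when a strict majority of neighbours carry it.
    return [
        1 if 2 * sum(Y[j][q] for j in neighbours) > len(neighbours) else 0
        for q in range(n_labels)
    ]


def hamming_loss(y_true, y_pred):
    """Fraction of label positions on which the two binary vectors disagree."""
    return sum(a != b for a, b in zip(y_true, y_pred)) / len(y_true)


def purify(X, Y, k=3, max_loss=0.5):
    """Keep only the instances whose own labels agree well enough with the
    labels predicted from their neighbours (assumed rule: loss <= max_loss)."""
    keep = [
        i for i in range(len(X))
        if hamming_loss(Y[i], knn_label_vote(X, Y, i, k)) <= max_loss
    ]
    return [X[i] for i in keep], [Y[i] for i in keep]
```

Calling `purify(X, Y)` on a list of feature vectors `X` and a parallel list of binary label vectors `Y` returns the retained subset; an instance whose labels contradict those of all its neighbours is the kind of point this rule discards.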
Keywords :
data analysis; learning (artificial intelligence); pattern classification; evidential k-nearest neighbors; input vectors; label assignment; machine learning; multilabel classification algorithm; training data purification; training instance; Classification algorithms; Loss measurement; Noise measurement; Training; Training data; Vectors;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Information Fusion (FUSION), 2012 15th International Conference on
Conference_Location :
Singapore
Print_ISBN :
978-1-4673-0417-7
Electronic_ISBN :
978-0-9824438-4-2
Type :
conf
Filename :
6290519