DocumentCode
567672
Title
Purifying training data to improve performance of multi-label classification algorithms
Author
Kanj, Sawsan ; Abdallah, Fahed ; Denœux, Thierry
Author_Institution
HEUDIASYC, Univ. de Technol. de Compiègne, Compiègne, France
fYear
2012
fDate
9-12 July 2012
Firstpage
1784
Lastpage
1791
Abstract
Multi-label classification assumes that each object in the training set is associated with a set of labels, and the goal is to assign labels to unseen instances. Algorithms based on k-nearest neighbors address the multi-label problem by using the information carried by the neighbors of the observation to be classified. Because of problems such as errors in the input vectors or in their labels, this information may be wrong and may cause the multi-label algorithm to fail. In this paper, we propose a simple algorithm that edits out some training instances through a vote over several metrics in order to purify the existing training sample. This purifying approach is applied to the recently proposed evidential k-nearest neighbors method for multi-label classification. Comparative experimental results on various data sets demonstrate the usefulness and effectiveness of our approach.
Keywords
data analysis; learning (artificial intelligence); pattern classification; evidential k-nearest neighbors; input vectors; label assignment; machine learning; multilabel classification algorithm; training data purification; training instance; Classification algorithms; Loss measurement; Noise measurement; Training; Training data; Vectors;
fLanguage
English
Publisher
IEEE
Conference_Titel
Information Fusion (FUSION), 2012 15th International Conference on
Conference_Location
Singapore
Print_ISBN
978-1-4673-0417-7
Electronic_ISBN
978-0-9824438-4-2
Type
conf
Filename
6290519