Title :
Integer Programming for Multi-class Active Learning
Author :
Yankov, Dragomir ; Rajan, Suju ; Ratnaparkhi, Adwait
Author_Institution :
Yahoo! Labs., Sunnyvale, CA, USA
Abstract :
Active learning has been demonstrated to be a powerful tool for improving the effectiveness of binary classifiers. It iteratively identifies informative unlabeled examples which, after labeling, are used to augment the initial training set. Adapting the procedure to large-scale, multi-class classification problems, however, poses certain challenges. For instance, to guarantee improvement by the method, we may need to select a large number of examples, which requires prohibitive labeling resources. Furthermore, the notion of informative examples also changes significantly when multiple classes are considered. In this paper we show that multi-class active learning can be cast into an integer programming framework, where a subset of examples that are informative across the maximum number of classes is selected. We test our approach on several large-scale document categorization problems. We demonstrate that, when labeling resources are limited and the number of classes is large, the proposed method is more effective than other known approaches.
Keywords :
document handling; integer programming; iterative methods; learning (artificial intelligence); pattern classification; binary classifier; document categorization; informative unlabeled example; multiclass active learning; multiclass classification; prohibitive labeling resource; active learning; multi-class classification
Conference_Title :
Data Mining Workshops (ICDMW), 2010 IEEE International Conference on
Conference_Location :
Sydney, NSW
Print_ISBN :
978-1-4244-9244-2
Electronic_ISBN :
978-0-7695-4257-7
DOI :
10.1109/ICDMW.2010.148