Title :
Co-training by Committee: A New Semi-supervised Learning Framework
Author :
Hady, M. ; Schwenker, Friedhelm
Author_Institution :
Inst. of Neural Inf. Process., Univ. of Ulm, Ulm
Abstract :
For many data mining applications, it is necessary to develop algorithms that use unlabeled data to improve the accuracy of supervised learning. Co-Training is a popular semi-supervised learning algorithm. It assumes that each example is represented by two or more redundantly sufficient sets of features (views) and that these views are independent given the class. However, these assumptions are not satisfied in many real-world application domains. Therefore, we present a framework called co-training by committee (CoBC), in which a set of diverse classifiers learn from each other. The framework is a simple, general single-view semi-supervised learner that can use any ensemble learner to build diverse committees. Experimental studies on CoBC using bagging, AdaBoost and the random subspace method (RSM) as ensemble learners demonstrate that error diversity among classifiers leads to effective co-training that requires neither redundant and independent views nor different learning algorithms.
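The single-view, committee-based idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the nearest-centroid base learner, the stratified bootstrap used to build a bagging-style committee, the use of vote agreement as a confidence score, and all function names (`cobc_sketch`, `vote`, and so on) are assumptions made purely for illustration. In each round, every committee member receives pseudo-labeled examples chosen by the remaining members, so the classifiers learn from each other without needing separate views.

```python
import random
from collections import Counter


def train_centroid(samples):
    """Fit a nearest-centroid classifier (hypothetical base learner)."""
    sums, counts = {}, Counter()
    for x, y in samples:
        counts[y] += 1
        acc = sums.setdefault(y, [0.0] * len(x))
        for i, v in enumerate(x):
            acc[i] += v
    return {y: [v / counts[y] for v in acc] for y, acc in sums.items()}


def predict(model, x):
    """Return the label whose centroid is closest to x."""
    return min(model, key=lambda y: sum((a - b) ** 2 for a, b in zip(x, model[y])))


def vote(models, x):
    """Majority label and the fraction of models that agree with it."""
    label, count = Counter(predict(m, x) for m in models).most_common(1)[0]
    return label, count / len(models)


def stratified_bootstrap(labeled, rng):
    """Resample with replacement within each class (assumed bagging variant)."""
    by_class = {}
    for x, y in labeled:
        by_class.setdefault(y, []).append((x, y))
    return [rng.choice(group) for group in by_class.values() for _ in group]


def cobc_sketch(labeled, unlabeled, n_members=3, rounds=3, per_round=2, seed=0):
    """Committee-based co-training sketch on a single feature view."""
    rng = random.Random(seed)
    pool = list(unlabeled)
    # Diversity comes from training each member on a different resample.
    train_sets = [stratified_bootstrap(labeled, rng) for _ in range(n_members)]
    models = [train_centroid(s) for s in train_sets]
    for _ in range(rounds):
        used = set()
        for i in range(n_members):
            # The companion committee (all members except i) labels the pool.
            companions = models[:i] + models[i + 1:]
            scored = [(vote(companions, x), j) for j, x in enumerate(pool)]
            scored.sort(key=lambda t: t[0][1], reverse=True)
            # The most confidently labeled examples are given to member i.
            for (label, _conf), j in scored[:per_round]:
                train_sets[i].append((pool[j], label))
                used.add(j)
        pool = [x for j, x in enumerate(pool) if j not in used]
        models = [train_centroid(s) for s in train_sets]
    return models


if __name__ == "__main__":
    labeled = [((0.0, 0.0), "a"), ((0.2, 0.1), "a"), ((5.0, 5.0), "b"), ((5.1, 4.9), "b")]
    unlabeled = [(0.1, 0.2), (0.3, 0.0), (4.8, 5.2), (5.2, 5.0), (0.0, 0.3), (4.9, 4.7)]
    committee = cobc_sketch(labeled, unlabeled)
    print(vote(committee, (0.1, 0.1)))
```

Swapping `stratified_bootstrap` for a boosting-style or random-subspace-style member builder would correspond to the other ensemble learners named in the abstract; the loop that lets companions teach each member would stay the same under this reading.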
Keywords :
data mining; learning (artificial intelligence); random processes; AdaBoost; cotraining by committee; data mining applications; diverse classifiers; error diversity; random subspace method; real-world application domains; semisupervised learning framework; Bagging; Conferences; Content based retrieval; Data mining; Image retrieval; Information processing; Information retrieval; Object detection; Semisupervised learning; Supervised learning; classification; co-training; ensemble learning; learning from unlabeled data; semi-supervised learning
Conference_Title :
Data Mining Workshops, 2008. ICDMW '08. IEEE International Conference on
Conference_Location :
Pisa
Print_ISBN :
978-0-7695-3503-6
Electronic_ISBN :
978-0-7695-3503-6
DOI :
10.1109/ICDMW.2008.27