Disagreement-Based Co-training

Author

Tanha, Jafar ; van Someren, Maarten ; Afsarmanesh, Hamideh

Author_Institution

Inf. Inst., Univ. of Amsterdam, Amsterdam, Netherlands

fYear

2011

fDate

7-9 Nov. 2011

Firstpage

803

Lastpage

810

Abstract

Semi-supervised learning algorithms such as co-training have recently been applied in many domains. In co-training, two classifiers, based on different subsets of the features or on different learning algorithms, are trained in parallel. Unlabeled data that the two classifiers label differently, but for which one classifier has high confidence, are labeled and used as training data for the other classifier. This paper proposes a new form of co-training, called Ensemble-Co-Training, that uses an ensemble of different learning algorithms. Based on a theorem by Angluin and Laird that relates noise in the data to the error of hypotheses learned from these data, we propose a criterion for finding a subset of high-confidence predictions and the error rate of a classifier in each iteration of the training process. Experiments show that the new method gives better results than state-of-the-art methods in almost all domains.
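
The labeling step that the abstract describes for co-training can be illustrated with a minimal sketch. It is not the paper's Ensemble-Co-Training procedure or its Angluin-Laird criterion. The function name `select_disagreements`, the fixed confidence `threshold` of 0.9, and the tie-break (the teaching classifier must also be more confident than the other) are assumptions made for illustration only.

```python
from typing import List, Tuple

Prediction = Tuple[int, float]  # (predicted label, confidence in [0, 1])
Labeled = Tuple[int, int]       # (index into the unlabeled pool, assigned label)


def select_disagreements(
    preds_a: List[Prediction],
    preds_b: List[Prediction],
    threshold: float = 0.9,  # assumed cut-off, not taken from the paper
) -> Tuple[List[Labeled], List[Labeled]]:
    """Return (new examples for A, new examples for B) for one round."""
    for_a: List[Labeled] = []
    for_b: List[Labeled] = []
    pairs = zip(preds_a, preds_b)
    for i, ((label_a, conf_a), (label_b, conf_b)) in enumerate(pairs):
        if label_a == label_b:
            continue  # only disagreements are considered
        # Assumed tie-break: the teacher must be confident and more
        # confident than the classifier it teaches.
        if conf_a >= threshold and conf_a > conf_b:
            for_b.append((i, label_a))  # A's label becomes training data for B
        elif conf_b >= threshold and conf_b > conf_a:
            for_a.append((i, label_b))  # B's label becomes training data for A
    return for_a, for_b


# Hypothetical predictions of two classifiers on four unlabeled items.
preds_a = [(1, 0.95), (0, 0.60), (1, 0.55), (0, 0.99)]
preds_b = [(0, 0.60), (0, 0.90), (0, 0.97), (0, 0.50)]
for_a, for_b = select_disagreements(preds_a, preds_b)
print(for_a, for_b)  # [(2, 0)] [(0, 1)]
```

In a full loop, each classifier would be retrained on its enlarged training set and the step repeated. The abstract indicates that the paper replaces a fixed threshold of this kind with a criterion derived from the Angluin and Laird theorem.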

Keywords

learning (artificial intelligence); pattern classification; classifiers; disagreement-based co-training; ensemble-co-training; semi-supervised learning; Boosting; Decision trees; Error analysis; Labeling; Prediction algorithms; Training; Training data; Co-training; Disagreement learning; Ensemble Learning; Self-training; Semi-Supervised Learning (SSL)

fLanguage

English

Publisher

IEEE

Conference_Titel

Tools with Artificial Intelligence (ICTAI), 2011 23rd IEEE International Conference on

Conference_Location

Boca Raton, FL

ISSN

1082-3409

Print_ISBN

978-1-4577-2068-0

Electronic_ISBN

1082-3409

Type

conf

DOI

10.1109/ICTAI.2011.126

Filename

6103417