DocumentCode
442647
Title
Co-training non-robust classifiers for video semantic concept detection
Author
Yan, Rong ; Naphade, Milind
Author_Institution
Sch. of Comput. Sci., Carnegie Mellon Univ., Pittsburgh, PA, USA
Volume
1
fYear
2005
fDate
11-14 Sept. 2005
Abstract
Semantic video characterization by automatic metadata tagging is increasingly popular. While some semantic concepts are unimodal, manifest in either the image or the audio modality, a large number are multimodal, manifest in both the image and the audio modalities. Further, while some concepts such as outdoors and face occur frequently enough in training sets, many others are rarer, which makes them difficult to detect during automatic annotation. Semi-supervised learning algorithms such as co-training may help by incorporating a large amount of unlabeled data, which holds the promise of allowing the redundant information across views to improve learning performance. Unfortunately, this promise has not been realized in multimedia content analysis, partly because the models built using the labeled data alone are not very robust, and their noisy classification of the unlabeled data set compounds the problems faced by the co-training algorithm. In this paper we analyze whether a judicious application of co-training, automatically labeling some of the unlabeled samples and adding them back into the training set, combined with manual quality control, can help improve detection performance. We report our findings in the context of the TRECVID 2003 common annotation corpus.
Keywords
image classification; image sequences; learning (artificial intelligence); video signal processing; TRECVID 2003 common annotation corpus; audio modality; automatic metadata tagging; cotraining nonrobust classifiers; image modality; multimedia content analysis; quality control; semisupervised learning algorithms; video semantic concept detection; Algorithm design and analysis; Face detection; Frequency; Labeling; Performance analysis; Quality control; Robustness; Semisupervised learning; Speech; Tagging;
fLanguage
English
Publisher
ieee
Conference_Titel
Image Processing, 2005. ICIP 2005. IEEE International Conference on
Print_ISBN
0-7803-9134-9
Type
conf
DOI
10.1109/ICIP.2005.1529973
Filename
1529973