DocumentCode :
2952990
Title :
TV Commercial Classification by using Multi-Modal Textual Information
Author :
Zheng, Yantao ; Duan, Lingyu ; Tian, Qi ; Jin, Jesse S.
Author_Institution :
NUS Graduate Sch., Nat. Univ. of Singapore
fYear :
2006
fDate :
9-12 July 2006
Firstpage :
497
Lastpage :
500
Abstract :
In this paper, we propose an approach for classifying TV commercial videos by the category of the advertised product or service (e.g., automobiles or healthcare products). Since automatic speech recognition (ASR) and optical character recognition (OCR) can deliver meaningful textual information related to products or services, TV commercial video classification is formulated as a text categorization problem. However, there are two challenges. First, the background music of TV commercials causes ASR techniques to yield erroneous and deficient output transcripts. Second, even if ASR and OCR worked perfectly, the limited textual information from TV commercials does not suffice to train a generic, non-overfitting text categorizer. For the first issue, our approach resorts to external resources to expand the deficient ASR and OCR transcripts. The output transcripts of ASR and OCR are parsed to yield a few keywords, on which a Web search is executed to retrieve relevant and semantically informative articles from the World Wide Web (WWW). The retrieved articles are then used to construct textual feature vectors and perform text categorization on behalf of the commercials. For the second issue, a topic-wise document corpus is constructed from public corpora such as Reuters-21578 or from articles manually collected from the WWW for training the text categorizers. Experimental results show that the proposed approach alleviates the negative effects of weak ASR/OCR performance and yields a promising classification accuracy of 80.9%.
Keywords :
image classification; optical character recognition; semantic Web; speech recognition; television broadcasting; ASR; OCR; TV commercial video classification; Web searching; automatic speech recognition; background music; feature vector; multimodal textual information; optical character recognition; semantically informative article extraction; Australia; Automatic speech recognition; Character recognition; Image segmentation; Information technology; Optical character recognition software; Speech analysis; TV; Text categorization; World Wide Web;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Multimedia and Expo, 2006 IEEE International Conference on
Conference_Location :
Toronto, Ont.
Print_ISBN :
1-4244-0366-7
Electronic_ISBN :
1-4244-0367-7
Type :
conf
DOI :
10.1109/ICME.2006.262434
Filename :
4036645