Title :
A robust transcription system for soccer video database
Author :
Pham, Nhut M. ; Duong, Duc A. ; Vu, Quan H.
Author_Institution :
Univ. of Sci., Ho Chi Minh City, Vietnam
Abstract :
This paper presents a robust approach to transcribing a soccer video database. The spoken information in the videos' audio channels is transcribed using a canonical speech recognition system. Since soccer videos vary in both speech quality and content, the transcription system faces three main problems: noisy data, interference from foreign terms, and emotional variations in speech prosody. One solution is proposed for each problem, respectively: a noise reduction scheme, a cross-lingual transliteration model, and an advanced acoustic modeling technique. The proposed methods are evaluated experimentally on the Vietnamese AFF Suzuki-cup database, which consists of over 14 hours of video. In the best case, the system reaches an accuracy rate of 83.3%.
Keywords :
speech recognition; video databases; video retrieval; Vietnamese AFF Suzuki-cup database; audio channels; canonical speech recognition system; cross-lingual transliteration model; emotional variations; foreign term interferences; noise reduction scheme; noisy data; robust transcription system; soccer video database; speech prosody; transcription system; Adaptation model; Noise measurement; Noise reduction; Speech; Speech enhancement; Speech recognition
Conference_Title :
Audio Language and Image Processing (ICALIP), 2010 International Conference on
Conference_Location :
Shanghai
Print_ISBN :
978-1-4244-5856-1
DOI :
10.1109/ICALIP.2010.5685108