DocumentCode :
3404210
Title :
Caption-aided speech detection in videos
Author :
Li, Cong ; Ou, Zhijian ; Hu, Wei ; Wang, Tao ; Zhang, Yimin
Author_Institution :
Dept. of Electron. Eng., Tsinghua Univ., Beijing
fYear :
2008
fDate :
March 31 2008-April 4 2008
Firstpage :
141
Lastpage :
144
Abstract :
This paper presents a novel audio-visual fusion method for speech detection, an important front-end for content-based video processing. The approach uses video captions to help extract homogeneous speech segments from the accompanying audio stream in real-world movie/TV videos. Captions are created mainly to help viewers follow the dialog rather than to locate speech regions accurately, so their time positions are imprecise. We propose a caption-aided speech detection approach that makes use of both caption information and audio information. The inaccurate caption positions are refined using audio features (pitch and MFCCs) and BIC-based acoustic change detection. Comparison experiments against several traditional speech detection approaches show that the proposed approach greatly improves speech detection performance.
Keywords :
Bayes methods; audio signal processing; feature extraction; signal detection; speech processing; video signal processing; BIC-based acoustic change detection; Bayesian information criterion; audio feature; audio stream; audio-visual fusion method; caption-aided speech detection; content-based video processing; real-world movie/TV video; speech segment extraction; Acoustic signal detection; Bayesian methods; Data mining; Detection algorithms; Indexing; Motion pictures; Speech processing; Streaming media; TV; Videos; Bayesian information criterion (BIC); caption detection; pitch; speech detection;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Acoustics, Speech and Signal Processing, 2008. ICASSP 2008. IEEE International Conference on
Conference_Location :
Las Vegas, NV
ISSN :
1520-6149
Print_ISBN :
978-1-4244-1483-3
Electronic_ISBN :
1520-6149
Type :
conf
DOI :
10.1109/ICASSP.2008.4517566
Filename :
4517566