DocumentCode :
3707942
Title :
Audiovisual voice activity detection using off-the-shelf cameras
Author :
S. Montazzolli; C. R. Jung; Dan Gelb
Author_Institution :
Institute of Informatics, Federal University of Rio Grande do Sul
fYear :
2015
Firstpage :
3886
Lastpage :
3890
Abstract :
This paper presents a new audiovisual voice activity detection (VAD) method for off-the-shelf cameras equipped with a color sensor and two microphones. The motion of particles in the mouth region of each face detected by the camera is used as the video cue, while the Generalized Cross Correlation with the PHase Transform (GCC-PHAT) is used as the audio cue. We then estimate the distribution of the audiovisual cues and produce the final VAD decision for each detected face using a Hidden Markov Model (HMM). Experimental results indicate that our method achieves an average accuracy of 87% on a set of test videos.
Keywords :
"Face","Cameras","Mouth","Speech","Microphones","Hidden Markov models","Feature extraction"
Publisher :
IEEE
Conference_Title :
Image Processing (ICIP), 2015 IEEE International Conference on
Type :
conf
DOI :
10.1109/ICIP.2015.7351533
Filename :
7351533