DocumentCode
3707942
Title
Audiovisual voice activity detection using off-the-shelf cameras
Author
S. Montazzolli;C. R. Jung;Dan Gelb
Author_Institution
Institute of Informatics, Federal University of Rio Grande do Sul
fYear
2015
Firstpage
3886
Lastpage
3890
Abstract
This paper presents a new audiovisual voice activity detection (VAD) method for off-the-shelf cameras equipped with a color sensor and two microphones. The motion of particles in the mouth region of each face detected by the camera is used as the video cue, while the Generalized Cross Correlation with the PHase Transform (GCC-PHAT) is used as the audio cue. We then estimate the distribution of the audiovisual cues and compute the final VAD result for each detected face using a Hidden Markov Model (HMM). Experimental results indicate that our method achieves an average accuracy of 87% on a set of test videos.
Keywords
"Face","Cameras","Mouth","Speech","Microphones","Hidden Markov models","Feature extraction"
Publisher
IEEE
Conference_Titel
Image Processing (ICIP), 2015 IEEE International Conference on
Type
conf
DOI
10.1109/ICIP.2015.7351533
Filename
7351533