• DocumentCode
    3707942
  • Title
    Audiovisual voice activity detection using off-the-shelf cameras
  • Author
    S. Montazzolli; C. R. Jung; Dan Gelb
  • Author_Institution
    Institute of Informatics, Federal University of Rio Grande do Sul
  • fYear
    2015
  • Firstpage
    3886
  • Lastpage
    3890
  • Abstract
    This paper presents a new audiovisual voice activity detection (VAD) method for off-the-shelf cameras equipped with a color sensor and two microphones. The motion of particles in the mouth region of each face detected by the camera is used as the video cue, while the Generalized Cross Correlation with the PHase Transform (GCC-PHAT) is used as the audio cue. We then estimate the distribution of the audiovisual cues and obtain the final VAD result for each detected face using a Hidden Markov Model (HMM). Experimental results indicate that our method achieves an average accuracy of 87% on a set of test videos.
  • Keywords
    "Face","Cameras","Mouth","Speech","Microphones","Hidden Markov models","Feature extraction"
  • Publisher
    IEEE
  • Conference_Titel
    Image Processing (ICIP), 2015 IEEE International Conference on
  • Type
    conf
  • DOI
    10.1109/ICIP.2015.7351533
  • Filename
    7351533
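
The abstract names GCC-PHAT as the audio cue computed from the two microphones. As a rough illustration of that technique only (not the authors' implementation, whose details are not given in this record), the sketch below estimates the integer sample delay between two signals by whitening their cross-spectrum and locating the peak of the resulting correlation. The function names `dft` and `gcc_phat_delay`, the `eps` regularizer, and the use of a naive O(n^2) DFT in place of an FFT are assumptions made to keep the example self-contained.

```python
import cmath
import random


def dft(x, inverse=False):
    """Naive discrete Fourier transform (O(n^2)); fine for short signals."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        acc = 0j
        for t in range(n):
            acc += x[t] * cmath.exp(sign * 2j * cmath.pi * k * t / n)
        out.append(acc / n if inverse else acc)
    return out


def gcc_phat_delay(sig, ref, eps=1e-12):
    """Return the delay of `sig` relative to `ref`, in samples (hypothetical helper)."""
    # Zero-pad both signals so the circular correlation behaves like a linear one.
    n = len(sig) + len(ref)
    sig_f = dft(list(sig) + [0.0] * (n - len(sig)))
    ref_f = dft(list(ref) + [0.0] * (n - len(ref)))

    # Cross-power spectrum, then the PHAT weighting: keep the phase, drop the magnitude.
    cross = [s * r.conjugate() for s, r in zip(sig_f, ref_f)]
    whitened = [c / (abs(c) + eps) for c in cross]

    # Back to the lag domain; the peak position is the delay estimate.
    cc = [v.real for v in dft(whitened, inverse=True)]
    peak = max(range(n), key=lambda i: cc[i])

    # Indices in the upper half of the buffer correspond to negative lags.
    return peak if peak < n // 2 else peak - n


if __name__ == "__main__":
    random.seed(0)
    ref = [random.gauss(0.0, 1.0) for _ in range(64)]
    sig = [0.0] * 3 + ref  # same signal, arriving 3 samples later
    print(gcc_phat_delay(sig, ref))
```

With two microphones, dividing the returned lag by the sampling rate gives a time difference of arrival, which is the kind of quantity an audio cue based on GCC-PHAT can build on. How the paper turns the correlation into a per-face cue is not stated in the abstract.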