• DocumentCode
    2449710
  • Title
    Robust automatic video-conferencing with multiple cameras and microphones
  • Author
    Wang, Ce ; Griebel, Scott ; Brandstein, Michael
  • Author_Institution
    Div. of Eng. & Appl. Sci., Harvard Univ., Cambridge, MA, USA
  • Volume
    3
  • fYear
    2000
  • fDate
    2000
  • Firstpage
    1585
  • Abstract
    An automatic video-conferencing system is proposed which employs acoustic source localization, video face tracking and pose estimation, and multi-channel speech enhancement. The video portion of the system tracks talkers by utilizing source motion, contour geometry, color data and simple facial features. Decisions involving which camera to use are based on an estimate of the head's gazing angle. This head pose estimation is achieved using a very general head model which employs hairline features and a learned network classification procedure. Finally, a wavelet microphone array technique is used to create an enhanced speech waveform to accompany the recorded video signal. The system presented in this paper is robust to both visual clutter (e.g. ovals in the scene of interest which are not faces) and audible noise (e.g. reverberations and background noise).
  • Keywords
    acoustic noise; acoustic radiators; audio-visual systems; clutter; face recognition; image classification; learning (artificial intelligence); microphones; optical tracking; speech enhancement; teleconferencing; video cameras; wavelet transforms; acoustic source localization; audible noise; background noise; cameras; color data; contour geometry; enhanced speech waveform; facial features; general head model; hairline features; head gazing angle; head pose estimation; learned network classification procedure; multi-channel speech enhancement; ovals; recorded video signal; reverberations; robust automatic video-conferencing; source motion; talker tracking; video face tracking; visual clutter; wavelet microphone array technique; Acoustic noise; Cameras; Facial features; Geometry; Layout; Magnetic heads; Microphone arrays; Noise robustness; Speech enhancement; Tracking;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Multimedia and Expo, 2000. ICME 2000. 2000 IEEE International Conference on
  • Conference_Location
    New York, NY
  • Print_ISBN
    0-7803-6536-4
  • Type
    conf
  • DOI
    10.1109/ICME.2000.871072
  • Filename
    871072