• DocumentCode
    1221355
  • Title

    A graphical model for audiovisual object tracking

  • Author

    Beal, Matthew J. ; Jojic, Nebojsa ; Attias, Hagai

  • Author_Institution
    Dept. of Comput. Sci., Toronto Univ., Ont., Canada
  • Volume
    25
  • Issue
    7
  • fYear
    2003
  • fDate
    7/1/2003 12:00:00 AM
  • Firstpage
    828
  • Lastpage
    836
  • Abstract
    We present a new approach to modeling and processing multimedia data. This approach is based on graphical models that combine audio and video variables. We demonstrate it by developing a new algorithm for tracking a moving object in a cluttered, noisy scene using two microphones and a camera. Our model uses unobserved variables to describe the data in terms of the process that generates them. It is therefore able to capture and exploit the statistical structure of the audio and video data separately, as well as their mutual dependencies. Model parameters are learned from data via an EM algorithm, and automatic calibration is performed as part of this procedure. Tracking is done by Bayesian inference of the object location from data. We demonstrate successful performance on multimedia clips captured in real-world scenarios using off-the-shelf equipment.
  • Keywords
    audio-visual systems; belief networks; calibration; computer graphics; multimedia systems; pattern recognition; probability; Bayesian inference; EM algorithm; audio data; audiovisual object tracking; automatic calibration; automatic calibrations; expectation-maximization algorithm; graphical model; multimedia data; video data; Background noise; Bayesian methods; Calibration; Cameras; Computer Society; Delay effects; Graphical models; Inference algorithms; Microphone arrays; Speech enhancement
  • fLanguage
    English
  • Journal_Title
    Pattern Analysis and Machine Intelligence, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    0162-8828
  • Type

    jour

  • DOI
    10.1109/TPAMI.2003.1206512
  • Filename
    1206512