• DocumentCode
    2520040
  • Title
    Kalman filters for audio-video source localization
  • Author
    Gehrig, Tobias ; Nickel, Kai ; Ekenel, Hazim Kemal ; Klee, Ulrich ; McDonough, John
  • Author_Institution
    Institut für Logik, Komplexität und Deduktionssysteme, Karlsruhe Univ., Germany
  • fYear
    2005
  • fDate
    16-19 Oct. 2005
  • Firstpage
    118
  • Lastpage
    121
  • Abstract
    In prior work, we proposed using an extended Kalman filter to directly update position estimates in a speaker localization system based on time delays of arrival. We found that such a scheme provided superior tracking quality compared with conventional closed-form approximation methods. In this work, we enhance our audio localizer with video information. We propose an algorithm to incorporate detected face positions in different camera views into the Kalman filter without doing any explicit triangulation. This approach yields a robust source localizer that functions reliably both in segments where the speaker is silent, which would be detrimental for an audio-only tracker, and in segments where many faces appear, which would confuse a video-only tracker. We tested our algorithm on a data set consisting of seminars held by actual speakers. Our experiments revealed that the audio-video localizer functioned better than a localizer based solely on audio or solely on video features.
  • Keywords
    Kalman filters; audio signal processing; delays; nonlinear filters; speaker recognition; time-of-arrival estimation; video signal processing; audio-video source localization; extended Kalman filter; face positions detection; source localizer; speaker localization system; time delays of arrival; tracking quality; video information; Acoustic testing; Cameras; Delay effects; Delay estimation; Face detection; Loudspeakers; Maximum likelihood estimation; Position measurement; Robustness; Seminars;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Applications of Signal Processing to Audio and Acoustics, 2005. IEEE Workshop on
  • Print_ISBN
    0-7803-9154-3
  • Type
    conf
  • DOI
    10.1109/ASPAA.2005.1540183
  • Filename
    1540183
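
The abstract describes feeding time delays of arrival and detected face positions into one extended Kalman filter without explicit triangulation. The sketch below illustrates one way such an update could look. It is not the authors' implementation: the 3-D position state, the pinhole projection matrices, the diagonal noise model, the speed of sound, and every function and parameter name are assumptions made for illustration, and the prediction step of the filter is omitted.

```python
import numpy as np

SPEED_OF_SOUND = 343.0  # m/s; assumed room-temperature value


def tdoa(x, m1, m2):
    """Predicted delay (s) between microphones m1 and m2 for a source at x."""
    return (np.linalg.norm(x - m1) - np.linalg.norm(x - m2)) / SPEED_OF_SOUND


def tdoa_jacobian(x, m1, m2):
    """Gradient of tdoa() with respect to the source position x."""
    d1, d2 = x - m1, x - m2
    return (d1 / np.linalg.norm(d1) - d2 / np.linalg.norm(d2)) / SPEED_OF_SOUND


def project(x, P):
    """Image coordinates of x under an assumed 3x4 pinhole projection matrix P."""
    u = P @ np.append(x, 1.0)
    return u[:2] / u[2]


def project_jacobian(x, P):
    """2x3 Jacobian of project() with respect to x (quotient rule)."""
    u = P @ np.append(x, 1.0)
    A = P[:, :3]
    return (A[:2] * u[2] - np.outer(u[:2], A[2])) / u[2] ** 2


def ekf_update(x, S, mic_pairs, delays, cameras, faces, var_tdoa, var_pix):
    """One EKF measurement update on position x with covariance S.

    Audio delays and per-camera face detections are stacked into a single
    observation vector, so no 3-D face position is ever triangulated.
    A face entry of None means that camera produced no detection.
    """
    y, h, H, r = [], [], [], []
    for (m1, m2), tau in zip(mic_pairs, delays):
        y.append([tau])
        h.append([tdoa(x, m1, m2)])
        H.append(tdoa_jacobian(x, m1, m2)[None, :])
        r.append([var_tdoa])
    for P, face in zip(cameras, faces):
        if face is None:
            continue
        y.append(face)
        h.append(project(x, P))
        H.append(project_jacobian(x, P))
        r.append([var_pix, var_pix])
    if not y:  # neither audio nor video observations: keep the prior
        return x, S
    y, h, r = (np.concatenate(v) for v in (y, h, r))
    H = np.vstack(H)
    K = S @ H.T @ np.linalg.inv(H @ S @ H.T + np.diag(r))  # Kalman gain
    return x + K @ (y - h), (np.eye(x.size) - K @ H) @ S
```

With this structure, a silent segment simply passes an empty `delays` list and the filter continues on video observations alone, while a frame with no face detections continues on audio alone. Choosing which face to pass in when several are detected is outside the scope of this sketch.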