• DocumentCode
    2559956
  • Title

    Auditory and Visual Integration based Localization and Tracking of Multiple Moving Sounds in Daily-life Environments

  • Author

    Hyun-Don Kim ; Komatani, K. ; Ogata, Takaaki ; Okuno, Hiroshi G.

  • Author_Institution
    Speech Media Process. Group, Kyoto Univ., Kyoto, Japan
  • fYear
    2007
  • fDate
    26-29 Aug. 2007
  • Firstpage
    399
  • Lastpage
    404
  • Abstract
    This paper presents techniques that enable talker tracking for effective human-robot interaction. To track moving people in daily-life environments, robots must localize multiple moving sound sources so that they can locate talkers. However, the conventional method requires a microphone array and impulse response data. We therefore propose integrating a cross-power spectrum phase analysis (CSP) method with an expectation-maximization (EM) algorithm. The CSP method can localize sound sources using only two microphones and does not need impulse response data. Moreover, the EM algorithm increases the system's effectiveness and allows it to cope with multiple sound sources. We confirmed that the proposed method performs better than the conventional method. In addition, we added a particle filter to the tracking process to produce a reliable tracking path; the particle filter also integrates audio-visual information effectively. Furthermore, the applied particle filter can track people while coping with various noises, including loud sounds, in daily-life environments.
  • Keywords
    audio signal processing; expectation-maximisation algorithm; man-machine systems; particle filtering (numerical methods); robot vision; spectral analysis; auditory based localization; cross-power spectrum phase analysis; daily-life environment; expectation-maximization algorithm; human-robot interaction; impulse response data; microphone array; multiple moving sound tracking; particle filter; visual based localization; Acoustic noise; Algorithm design and analysis; Human robot interaction; Intelligent robots; Microphone arrays; Particle filters; Particle tracking; Streaming media; Surgery; Working environment noise;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Robot and Human interactive Communication, 2007. RO-MAN 2007. The 16th IEEE International Symposium on
  • Conference_Location
    Jeju
  • Print_ISBN
    978-1-4244-1634-9
  • Type

    conf

  • DOI
    10.1109/ROMAN.2007.4415117
  • Filename
    4415117