• DocumentCode
    3029484
  • Title
    Multi-modal fusion with particle filter for speaker localization and tracking
  • Author
    Heuer, Michael ; Al-Hamadi, Ayoub ; Michaelis, Bernd ; Wendemuth, Andreas
  • Author_Institution
    Inst. for Electron., Signal Process. & Commun., Otto-von-Guericke Univ. of Magdeburg, Magdeburg, Germany
  • fYear
    2011
  • fDate
    26-28 July 2011
  • Firstpage
    6450
  • Lastpage
    6453
  • Abstract
    This paper describes a methodology for meaningfully fusing multimodal data in order to detect and track a speaker with a conventional sensor setup. We use Gaussian mixtures to combine the sensor information within a particle filter, so that a single speaker can be identified in the presence of multiple visual observations. The major advantages are design choices that let the system run in real time within an easily extensible framework. In addition, the approach substantially reduces noise, which yields a more dependable prediction. Results illustrate the localization estimates in a two-person and a three-person scenario.
  • Keywords
    particle filtering (numerical methods); signal denoising; speaker recognition; Gaussian mixtures; conventional sensor setup; localization estimations; multimodal data fusion; multimodal fusion; multiple visual observations; noise reduction; particle filter; speaker localization; speaker tracking; Computational modeling; Feature extraction; Image color analysis; Image segmentation; Particle filters; Skin; Streaming media; Computer Vision; Data Fusion; Pattern Recognition;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Multimedia Technology (ICMT), 2011 International Conference on
  • Conference_Location
    Hangzhou
  • Print_ISBN
    978-1-61284-771-9
  • Type
    conf
  • DOI
    10.1109/ICMT.2011.6002028
  • Filename
    6002028