• DocumentCode
    698825
  • Title
    A multimodal approach to extract optimized audio features for speaker detection
  • Author
    Besson, Patricia; Kunt, Murat; Butz, Torsten; Thiran, Jean-Philippe
  • Author_Institution
    Signal Process. Inst. (ITS), Ecole Polytech. Fed. de Lausanne (EPFL), Lausanne, Switzerland
  • fYear
    2005
  • fDate
    4-8 Sept. 2005
  • Firstpage
    1
  • Lastpage
    4
  • Abstract
    We present a method that exploits the information-theoretic framework described in [1] to extract audio features that are optimal with respect to the video features. A simple measure of mutual information between the resulting audio features and the video features makes it possible to detect the active speaker among several candidates. The results show that our method is able to exploit the shared speech information contained in the audio and video signals to recover their common source.
  • Keywords
    audio signal processing; feature extraction; image recognition; speaker recognition; video signal processing; active speaker detection; information theoretic framework; multimodal approach; optimized audio feature extraction; shared speech information; speaker detection; video features; video signals; Data mining; Feature extraction; Markov processes; Mouth; Mutual information; Optimization; Speech
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Signal Processing Conference, 2005 13th European
  • Conference_Location
    Antalya
  • Print_ISBN
    978-1-60423-821-1
  • Type
    conf
  • Filename
    7078419