Title :
A multimodal approach to extract optimized audio features for speaker detection
Author :
Besson, Patricia ; Kunt, Murat ; Butz, Torsten ; Thiran, Jean-Philippe
Author_Institution :
Signal Process. Inst. (ITS), Ecole Polytech. Fed. de Lausanne (EPFL), Lausanne, Switzerland
Abstract :
We present a method that exploits the information-theoretic framework described in [1] to extract audio features that are optimal with respect to the video features. A simple measure of mutual information between the resulting audio features and the video features then makes it possible to detect the active speaker among several candidates. The results show that the method can exploit the shared speech information contained in the audio and video signals to recover their common source.
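The selection step mentioned in the abstract (score each candidate by the mutual information between the audio features and that candidate's video features, then keep the highest score) can be illustrated with a minimal sketch. It is not the authors' implementation: the histogram (plug-in) estimator, the bin count, and the names `mutual_information` and `detect_active_speaker` are assumptions made for illustration, and the feature optimization from [1] is not reproduced here.

```python
import numpy as np


def mutual_information(x, y, bins=8):
    """Plug-in (histogram) estimate of I(X;Y) in bits for two 1-D sequences.

    Assumption: equal-width binning; the paper may use a different estimator.
    """
    joint, _, _ = np.histogram2d(x, y, bins=bins)
    p_xy = joint / joint.sum()              # joint probability table
    p_x = p_xy.sum(axis=1, keepdims=True)   # marginal of x, shape (bins, 1)
    p_y = p_xy.sum(axis=0, keepdims=True)   # marginal of y, shape (1, bins)
    nz = p_xy > 0                           # skip empty cells to avoid log(0)
    return float(np.sum(p_xy[nz] * np.log2(p_xy[nz] / (p_x @ p_y)[nz])))


def detect_active_speaker(audio_feature, video_features, bins=8):
    """Return the index of the candidate whose video feature shares the most
    information with the audio feature, together with all scores.

    `video_features` is a hypothetical list holding one 1-D sequence per
    candidate, sampled at the same instants as `audio_feature`.
    """
    scores = [mutual_information(audio_feature, v, bins) for v in video_features]
    return int(np.argmax(scores)), scores
```

With one-dimensional features sampled at the video frame rate, calling `detect_active_speaker(audio, [mouth_a, mouth_b])` would return the index of the candidate whose mouth-region feature is most informative about the audio; `mouth_a` and `mouth_b` are placeholder names.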
Keywords :
audio signal processing; feature extraction; image recognition; speaker recognition; video signal processing; active speaker detection; information theoretic framework; multimodal approach; optimized audio feature extraction; shared speech information; speaker detection; video features; video signals; Data mining; Feature extraction; Markov processes; Mouth; Mutual information; Optimization; Speech
Conference_Title :
Signal Processing Conference, 2005 13th European
Conference_Location :
Antalya
Print_ISBN :
978-160-4238-21-1