DocumentCode
698825
Title
A multimodal approach to extract optimized audio features for speaker detection
Author
Besson, Patricia ; Kunt, Murat ; Butz, Torsten ; Thiran, Jean-Philippe
Author_Institution
Signal Process. Inst. (ITS), Ecole Polytech. Fed. de Lausanne (EPFL), Lausanne, Switzerland
fYear
2005
fDate
4-8 Sept. 2005
Firstpage
1
Lastpage
4
Abstract
We present a method that exploits the information theoretic framework described in [1] to extract audio features that are optimal with respect to the video features. A simple measure of mutual information between the resulting audio features and the video features makes it possible to detect the active speaker among different candidates. The results show that our method is able to exploit the shared speech information contained in audio and video signals to recover their common source.
Keywords
audio signal processing; feature extraction; image recognition; speaker recognition; video signal processing; active speaker detection; information theoretic framework; multimodal approach; optimized audio feature extraction; shared speech information; speaker detection; video features; video signals; Data mining; Feature extraction; Markov processes; Mouth; Mutual information; Optimization; Speech
fLanguage
English
Publisher
IEEE
Conference_Titel
Signal Processing Conference, 2005 13th European
Conference_Location
Antalya
Print_ISBN
978-160-4238-21-1
Type
conf
Filename
7078419