DocumentCode
2528356
Title
Multimodal Fusion and Learning with Uncertain Features Applied to Audiovisual Speech Recognition
Author
Papandreou, George ; Katsamanis, Athanassios ; Pitsikalis, Vassilis ; Maragos, Petros
Author_Institution
Nat. Tech. Univ. of Athens, Athens
fYear
2007
fDate
1-3 Oct. 2007
Firstpage
264
Lastpage
267
Abstract
We study the effect of uncertain feature measurements and show how classification and learning rules should be adjusted to compensate for it. Our approach is particularly fruitful in multimodal fusion scenarios, such as audio-visual speech recognition, where multiple streams of complementary features with time-varying reliability are integrated. For such applications, by taking the measurement noise uncertainty of each feature stream into account, the proposed framework leads to highly adaptive multimodal fusion rules for classification and learning which are widely applicable and easy to implement. We further show that previous multimodal fusion methods relying on stream weights fall within our scheme under certain assumptions; this provides novel insights into their applicability for various tasks and suggests new practical ways of estimating the stream weights adaptively. The potential of our approach is demonstrated in audio-visual speech recognition experiments.
Keywords
audio-visual systems; speech recognition; adaptive multimodal fusion rules; audio-visual speech recognition; learning rules; uncertain features learning; Automatic speech recognition; Electric variables measurement; Hidden Markov models; Humans; Measurement uncertainty; Noise measurement; Probability; Speech recognition; Streaming media; Working environment noise
fLanguage
English
Publisher
IEEE
Conference_Titel
Multimedia Signal Processing, 2007. MMSP 2007. IEEE 9th Workshop on
Conference_Location
Crete
Print_ISBN
978-1-4244-1274-7
Type
conf
DOI
10.1109/MMSP.2007.4412868
Filename
4412868