DocumentCode
177989
Title
A new EM estimation of dynamic stream weights for coupled-HMM-based audio-visual ASR
Author
Abdelaziz, Ahmed Hussen ; Zeiler, Steffen ; Kolossa, Dorothea
Author_Institution
Digital Signal Process. Group, Ruhr-Univ. Bochum, Bochum, Germany
fYear
2014
fDate
4-9 May 2014
Firstpage
1527
Lastpage
1531
Abstract
Mutually deploying visual and acoustical information in automatic speech recognition systems increases their robustness against acoustical environmental effects like additive noise and reverberation. Optimal fusion of the audio and video streams requires dynamic adaptation of the relative contribution of each modality. This can be achieved by weighting each stream according to its reliability by an appropriate stream weight. In this paper we propose a new expectation maximization algorithm that estimates oracle frame-dependent stream weights for coupled-HMM-based audio-visual speech recognition. Moreover, we introduce a greedy optimization approach that reasonably initializes this algorithm. The proposed approach is evaluated on the Grid audio-visual database and results in an average relative word error rate reduction of 38% and 58% compared to grid search and Bayes fusion, respectively. The estimated oracle stream weights can be used instead of the conventional global fixed stream weights to improve the supervised training of stream weight estimators.
Keywords
expectation-maximisation algorithm; optimisation; speech recognition; EM estimation; audio streams; automatic speech recognition systems; coupled HMM based audio visual ASR; dynamic stream weights; expectation maximization algorithm; greedy optimization approach; optimal fusion; oracle frame dependent stream weights; video streams; Acoustics; Conferences; Decision support systems; Speech; Speech processing; AVASR; CHMM; Stream weight
fLanguage
English
Publisher
IEEE
Conference_Titel
Acoustics, Speech and Signal Processing (ICASSP), 2014 IEEE International Conference on
Conference_Location
Florence
Type
conf
DOI
10.1109/ICASSP.2014.6853853
Filename
6853853