DocumentCode
2800945
Title
Model-based dereverberation in the logmelspec domain for robust distant-talking speech recognition
Author
Sehr, Armin ; Maas, Roland ; Kellermann, Walter
Author_Institution
Multimedia Commun. & Signal Process., Univ. of Erlangen-Nuremberg, Erlangen, Germany
fYear
2010
fDate
14-19 March 2010
Firstpage
4298
Lastpage
4301
Abstract
The REMOS (REverberation MOdeling for Speech recognition) concept for reverberation-robust distant-talking speech recognition, previously introduced for melspectral features, is extended in this contribution to logarithmic melspectral (logmelspec) features. Based on a combined acoustic model consisting of a hidden Markov model network and a reverberation model, REMOS determines clean-speech and reverberation estimates during recognition by an inner optimization operation. A reformulation of this inner optimization problem for logmelspec features, allowing an efficient solution by nonlinear optimization algorithms, is derived in this paper so that an efficient implementation of REMOS for logmelspec features becomes possible. Connected digit recognition experiments show that the proposed REMOS implementation significantly outperforms reverberantly trained HMMs in highly reverberant environments.
Keywords
estimation theory; hidden Markov models; nonlinear programming; reverberation; spectral analysis; speech recognition; HMM; REMOS; acoustic model; clean-speech; digit recognition experiments; hidden Markov model network; inner optimization operation; logarithmic melspectral features; logmelspec domain; logmelspec features; model-based dereverberation; nonlinear optimization algorithms; reverberant environments; reverberation estimates; reverberation modeling for speech recognition; reverberation-robust distant-talking speech recognition; Automatic speech recognition; Dispersion; Hidden Markov models; Loudspeakers; Microphones; Optimization methods; Reverberation; Robustness; Speech recognition; Viterbi algorithm; Reverberation; acoustic modeling; distant-talking ASR; model-based dereverberation; robust ASR;
fLanguage
English
Publisher
IEEE
Conference_Titel
Acoustics Speech and Signal Processing (ICASSP), 2010 IEEE International Conference on
Conference_Location
Dallas, TX
ISSN
1520-6149
Print_ISBN
978-1-4244-4295-9
Electronic_ISBN
1520-6149
Type
conf
DOI
10.1109/ICASSP.2010.5495671
Filename
5495671