Title :
On separating environmental and speaker adaptation
Author :
Lima, Carlos ; Silva, Carlos ; Tavares, Adriano ; Oliveira, Jorge
Author_Institution :
Dept. of Ind. Electron., Univ. of Minho, Portugal
Abstract :
This paper presents a maximum likelihood (ML) approach to background model estimation in noisy, non-stationary acoustic environments. The external noise source is characterised by a time-constant convolutional component and a time-varying additive component. The HMM composition technique provides a mechanism for integrating parametric models of the acoustic background with the signal model, so that noise compensation is tightly coupled with background model estimation. However, existing continuous adaptation algorithms usually do not take advantage of this approach and are essentially based on the MLLR algorithm. Consequently, a model for environmental mismatch is not available and, even under constrained conditions, a significant number of model parameters have to be updated. From a theoretical point of view, only the noise model parameters need to be updated, since the clean speech parameters are unchanged by the environment. It can therefore be advantageous to have a model for environmental mismatch. Additionally, separating the additive and convolutional components amounts to separating environmental mismatch from speaker mismatch when the channel does not change for long periods. This approach was followed in developing the algorithm proposed in this paper. One drawback sometimes attributed to continuous adaptation is that recognition failures lead to poor background estimates. This paper also proposes a MAP-like method to deal with this situation.
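As a rough illustration of the environment model the abstract describes, the sketch below combines a clean-speech mean, a channel bias (the convolutional component) and a noise mean (the additive component) in the log-spectral domain. It assumes the widely used log-add approximation for composing model means; the abstract does not state which composition formula the paper uses, and the function names and numeric values are hypothetical.

```python
import math


def log_add(a, b):
    """Numerically stable log(exp(a) + exp(b))."""
    hi, lo = max(a, b), min(a, b)
    return hi + math.log1p(math.exp(lo - hi))


def compose_mean(speech_mean, channel_bias, noise_mean):
    """Compose a noisy-speech mean per log-spectral bin.

    Assumption (not taken from the paper): convolution with the channel
    becomes an additive bias in the log-spectral domain, and additive
    noise is combined with the log-add approximation.
    """
    return [
        log_add(s + h, n)
        for s, h, n in zip(speech_mean, channel_bias, noise_mean)
    ]


# Hypothetical log-spectral values for three frequency bins.
speech_mean = [2.0, 0.0, -1.0]   # clean speech model, left untouched
channel_bias = [0.5, 0.5, 0.5]   # time-constant convolutional component
noise_mean = [-3.0, 0.0, 1.0]    # time-varying additive component

noisy_mean = compose_mean(speech_mean, channel_bias, noise_mean)

# Re-estimating the environment only changes noise_mean (and, rarely,
# channel_bias); the clean speech parameters are reused as they are.
updated_noisy_mean = compose_mean(speech_mean, channel_bias, [0.0, 0.0, 0.0])
```

In this sketch the clean-speech list is never modified when the background changes, which mirrors the abstract's point that only the noise model parameters need updating.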
Keywords :
hidden Markov models; maximum likelihood decoding; maximum likelihood estimation; noise (working environment); speaker recognition; MAP-like method; MLLR algorithm; adaptation algorithm; background model estimation; environmental mismatch; external noise; hidden Markov model composition technique; maximum likelihood; noise compensation; noisy acoustic nonstationary environment; speaker adaptation; speaker mismatch; time constant convolutional component; time varying additive component; Acoustic noise; Additive noise; Background noise; Convolution; Hidden Markov models; Loudspeakers; Maximum likelihood estimation; Maximum likelihood linear regression; Parametric statistics; Working environment noise;
Conference_Title :
Signal Processing and Its Applications, 2003. Proceedings. Seventh International Symposium on
Print_ISBN :
0-7803-7946-2
DOI :
10.1109/ISSPA.2003.1224728