Title :
Unsupervised noise model estimation for model-based robust speech recognition
Author :
Graciarena, Martin ; Franco, Horacio
Author_Institution :
Speech Technol. & Res. Lab., SRI Int., USA
Date :
30 Nov.-3 Dec. 2003
Abstract :
Within the framework of a generalization of Rose's integrated parametric model (IPM) to the Gaussian mixture hidden Markov model (HMM) formulation to model speech in noisy environments, we further extend an algorithm for the maximum likelihood (ML) estimation of the noise HMM component. We propose: (1) a gain normalization algorithm in the log filterbank domain, which uses a gain estimate that is less degraded in broadband noise conditions; (2) the use of an augmented feature to incorporate dynamic information, which enables the use of the same probability density computation as the instantaneous feature; and (3) a technique for unsupervised noise model estimation using a phone loop grammar, which does not require an initial recognition pass. This algorithm does not require speech/nonspeech detection. In noisy digit recognition experiments, using HTK and NOISEX-92 databases, the noise estimation algorithm achieves, in supervised and unsupervised cases, performance similar to using noise models trained with the noise source data only.
Keywords :
acoustic noise; grammars; hidden Markov models; maximum likelihood estimation; speech recognition; Gaussian mixture hidden Markov model; Rose integrated parametric model; broadband noise; dynamic information incorporation; log filterbank gain normalization; maximum likelihood estimation; model-based robust speech recognition; noise HMM component; noisy digit recognition; noisy environment speech; phone loop grammar; probability density computation; unsupervised noise model estimation; Filter bank; Gaussian noise; Hidden Markov models; Maximum likelihood detection; Maximum likelihood estimation; Noise robustness; Parametric statistics; Speech enhancement; Speech recognition; Working environment noise;
Conference_Title :
Automatic Speech Recognition and Understanding, 2003. ASRU '03. 2003 IEEE Workshop on
Print_ISBN :
0-7803-7980-2
DOI :
10.1109/ASRU.2003.1318466