DocumentCode :
336738
Title :
Convolutional density estimation in hidden Markov models for speech recognition
Author :
Matsoukas, Spyros ; Zavaliagkos, George
Author_Institution :
BBN Technol./GTE Internetworking, Cambridge, MA, USA
Volume :
1
fYear :
1999
fDate :
15-19 Mar 1999
Firstpage :
113
Abstract :
In continuous density hidden Markov models (HMMs) for speech recognition, the probability density function (PDF) of each state is usually expressed as a mixture of Gaussians. We present a model in which the PDF is expressed as the convolution of two densities. We focus on the special case where one of the convolved densities is an M-Gaussian mixture and the other is a mixture of N impulses. We present the reestimation formulae for the parameters of the M×N convolutional model and suggest two ways of initializing them: the residual K-Means approach, and deconvolution from a standard HMM with MN Gaussians per state, using a genetic algorithm to search for the optimal assignment of Gaussians. Both methods result in a compact representation that requires only O(M+N) storage space for the model parameters and O(MN) time for training and decoding. We explain how the decoding time can be reduced to O(M+kN), where k<M. Finally, results are shown on the 1996 Hub-4 Development test, demonstrating that a 32×2 convolutional model can achieve performance comparable to that of a standard 64-Gaussian per state model.
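The M×N structure described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes one-dimensional features with scalar variances, and all function and parameter names are hypothetical. Convolving an M-Gaussian mixture with N weighted impulses yields MN Gaussians whose means are shifted by the impulse positions and whose weights are products, while only the M+N underlying parameter sets need to be stored.

```python
import math


def gaussian_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-((x - mean) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def expand_convolution(weights, means, variances, imp_weights, imp_shifts):
    """Expand the convolution into an explicit list of M*N Gaussians.

    Each expanded component has weight w_m * c_n, mean mu_m + d_n,
    and the variance of the m-th Gaussian (impulses add no variance).
    """
    components = []
    for w, mu, var in zip(weights, means, variances):
        for c, d in zip(imp_weights, imp_shifts):
            components.append((w * c, mu + d, var))
    return components


def expanded_pdf(components, x):
    """Evaluate an explicit Gaussian mixture at x."""
    return sum(w * gaussian_pdf(x, mu, var) for w, mu, var in components)


def convolutional_pdf(weights, means, variances, imp_weights, imp_shifts, x):
    """Evaluate the same density from the M+N stored parameter sets.

    Convolving f with sum_n c_n * delta(x - d_n) gives sum_n c_n * f(x - d_n).
    This direct evaluation still costs O(M*N) Gaussian evaluations.
    """
    total = 0.0
    for c, d in zip(imp_weights, imp_shifts):
        for w, mu, var in zip(weights, means, variances):
            total += c * w * gaussian_pdf(x - d, mu, var)
    return total


# Hypothetical toy parameters: M = 2 Gaussians, N = 2 impulses.
weights, means, variances = [0.6, 0.4], [0.0, 3.0], [1.0, 0.5]
imp_weights, imp_shifts = [0.7, 0.3], [-0.5, 1.0]

components = expand_convolution(weights, means, variances, imp_weights, imp_shifts)
direct = convolutional_pdf(weights, means, variances, imp_weights, imp_shifts, 0.8)
explicit = expanded_pdf(components, 0.8)
```

In this toy setting the two evaluations agree, which shows that the compact form represents the same density as the expanded mixture of MN Gaussians. The reduction of decoding time to O(M+kN) mentioned in the abstract is not reproduced here, since the abstract does not describe how it is achieved.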
Keywords :
Gaussian processes; computational complexity; convolution; decoding; deconvolution; genetic algorithms; hidden Markov models; parameter estimation; probability; speech recognition; 1996 Hub-4 Development test; Gaussians mixture; HMM; M-Gaussian mixture; PDF; compact representation; continuous density hidden Markov models; convolutional density estimation; convolutional model parameters; decoding time; deconvolution; genetic algorithm; impulses; optimal Gaussians assignment; performance; probability density function; reestimation formulae; residual K-Means approach; speech recognition; storage space; training time; Convolution; Decoding; Deconvolution; Gaussian processes; Genetic algorithms; Hidden Markov models; Probability density function; Speech recognition; Standards development; Testing;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Acoustics, Speech, and Signal Processing, 1999. Proceedings., 1999 IEEE International Conference on
Conference_Location :
Phoenix, AZ
ISSN :
1520-6149
Print_ISBN :
0-7803-5041-3
Type :
conf
DOI :
10.1109/ICASSP.1999.758075
Filename :
758075