Title :
A continuous density interpretation of discrete HMM systems and MMI-neural networks
Author :
Neukirchen, Christoph ; Rottland, Jorg ; Willett, Daniel ; Rigoll, Gerhard
Author_Institution :
Philips Res. Lab., Aachen, Germany
fDate :
5/1/2001 12:00:00 AM
Abstract :
This paper integrates the traditional combination of a vector quantizer (VQ) and discrete hidden Markov models (HMMs) into the mixture emission density framework commonly used in automatic speech recognition (ASR). It is shown that the probability density of a system consisting of a VQ and a discrete classifier can be interpreted as a special case of a semi-continuous mixture model. Thus, the VQ parameters and the classifier can be trained jointly. In this framework, a gradient-based VQ training method for single and multiple feature stream systems is derived. This leads to an approach that is directly related to the paradigm of maximum mutual information (MMI) neural networks, which have previously been applied successfully as VQs in ASR. In continuous speech recognition experiments on the Resource Management and Wall Street Journal databases, the presented systems achieve recognition accuracies that compete well with comparable Gaussian mixture HMMs. Thus, we demonstrate that the performance degradations often reported for discrete HMM systems are caused not mainly by the vector quantization process itself, but by the traditional separation of the VQ and the HMM during parameter estimation. These degradations can be avoided by training the entire system as described here, while keeping the attractive computational speed of discrete HMMs.
Keywords :
hidden Markov models; neural nets; parameter estimation; probability; signal classification; speech recognition; vector quantisation; Gaussian mixture HMM; MMI-neural networks; Resource Management database; VQ parameters; Wall Street Journal database; automatic speech recognition; computational speed; continuous density interpretation; continuous speech recognition; discrete HMM systems; discrete classifier; discrete hidden Markov models; gradient based VQ training method; maximum mutual information neural networks; mixture emission density; multiple feature stream system; probability density; recognition accuracy; semi-continuous mixture model; single feature stream system; vector quantization; vector quantizer; Automatic speech recognition; Databases; Degradation; Hidden Markov models; Mutual information; Neural networks; Parameter estimation; Resource management; Speech recognition; Vector quantization
Journal_Title :
Speech and Audio Processing, IEEE Transactions on