• DocumentCode
    1467908
  • Title
    A continuous density interpretation of discrete HMM systems and MMI-neural networks
  • Author
    Neukirchen, Christoph ; Rottland, Jorg ; Willett, Daniel ; Rigoll, Gerhard
  • Author_Institution
    Philips Res. Lab., Aachen, Germany
  • Volume
    9
  • Issue
    4
  • fYear
    2001
  • fDate
    5/1/2001 12:00:00 AM
  • Firstpage
    367
  • Lastpage
    377
  • Abstract
    This paper integrates the traditional combination of a vector quantizer (VQ) and discrete hidden Markov models (HMMs) into the mixture emission density framework commonly used in automatic speech recognition (ASR). It is shown that the probability density of a system consisting of a VQ and a discrete classifier can be interpreted as a special case of a semi-continuous mixture model. Thus, the VQ parameters and the classifier can be trained jointly. In this framework, a gradient-based VQ training method for single and multiple feature stream systems is derived. This leads to an approach that is directly related to the paradigm of maximum mutual information (MMI) neural networks, which has previously been applied successfully as a VQ in ASR. In continuous speech recognition experiments carried out on the Resource Management and Wall Street Journal databases, the presented systems achieve recognition accuracies that compete well with comparable Gaussian mixture HMMs. Thus, we demonstrate that the performance degradations often reported for discrete HMM systems are not mainly caused by the vector quantization process itself, but are due to the traditional separation of the VQ and the HMM during parameter estimation. These degradations can be avoided by training the entire system as described here, while keeping the attractive computational speed of discrete HMMs.
  • Keywords
    hidden Markov models; neural nets; parameter estimation; probability; signal classification; speech recognition; vector quantisation; Gaussian mixture HMM; MMI-neural networks; Resource Management database; VQ parameters; Wall Street Journal database; automatic speech recognition; computational speed; continuous density interpretation; continuous speech recognition; discrete HMM systems; discrete classifier; discrete hidden Markov models; gradient based VQ training method; maximum mutual information neural networks; mixture emission density; multiple feature stream system; probability density; recognition accuracy; semi-continuous mixture model; single feature stream system; vector quantization; vector quantizer; Automatic speech recognition; Databases; Degradation; Hidden Markov models; Mutual information; Neural networks; Parameter estimation; Resource management; Speech recognition; Vector quantization
  • fLanguage
    English
  • Journal_Title
    Speech and Audio Processing, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    1063-6676
  • Type
    jour
  • DOI
    10.1109/89.917682
  • Filename
    917682