• DocumentCode
    294614
  • Title
    Discrete MMI probability models for HMM speech recognition
  • Author
    Foote, J.T.
  • Author_Institution
    Dept. of Eng., Cambridge Univ., UK
  • Volume
    1
  • fYear
    1995
  • fDate
    9-12 May 1995
  • Firstpage
    461
  • Abstract
    This paper presents a method of non-parametrically modeling HMM output probabilities. Discrete output probabilities are estimated from a tree-based maximum mutual information (MMI) partition of the feature space, rather than the usual vector quantization. One advantage of a decision-tree method is that very high-dimensional spaces can be partitioned. Time variation can then be explicitly modeled by concatenating time-adjacent vectors, which is shown to improve recognition performance. Though the model is discrete, it provides recognition performance better than 1-component Gaussian mixture HMMs on the ARPA Resource Management (RM) task. This method is not without drawbacks: because of its non-parametric nature, a large number of parameters is needed for a good model, and the available RM training data is probably not sufficient. Besides the computational advantages of a discrete model, this method has promising applications in talker identification, adaptation, and clustering.
  • Keywords
    feature extraction; hidden Markov models; information theory; probability; quantisation (signal); speech processing; speech recognition; trees (mathematics); ARPA Resource Management task; HMM output probabilities; HMM speech recognition; clustering; decision-tree method; discrete MMI probability models; discrete output probabilities; feature space; maximum mutual information; nonparametric modeling; recognition performance; talker adaptation; talker identification; time variation; time-adjacent vectors; training data; tree-based MMI partition; Acoustic distortion; Cost function; Decision trees; Greedy algorithms; Hidden Markov models; Mutual information; Quantization; Resource management; Robustness; Speech recognition; Training data; Viterbi algorithm
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Acoustics, Speech, and Signal Processing, 1995. ICASSP-95., 1995 International Conference on
  • Conference_Location
    Detroit, MI
  • ISSN
    1520-6149
  • Print_ISBN
    0-7803-2431-5
  • Type
    conf
  • DOI
    10.1109/ICASSP.1995.479628
  • Filename
    479628