• DocumentCode
    1862655
  • Title
    A new hybrid system based on MMI-neural networks for the RM speech recognition task
  • Author
    Rigoll, Gerhard ; Neukirchen, Christoph ; Rottland, J.
  • Author_Institution
    Dept. of Comput. Sci., Gerhard-Mercator-Univ. Duisburg, Germany
  • Volume
    2
  • fYear
    1996
  • fDate
    7-10 May 1996
  • Firstpage
    865
  • Abstract
    We present a hybrid system for speaker-independent continuous speech recognition. The system combines a novel information-theory-based neural network (NN) paradigm with discrete hidden Markov models (HMMs), including state-of-the-art techniques such as state-clustered triphones. The novel NN type is trained by an algorithm based on principles of self-organization that achieves maximum mutual information between the generated output labels and the basic phonetic classes. The structure of the hybrid system is quite similar to that of a classical VQ-HMM system, but the vector quantizer (VQ) is replaced by the NN. To evaluate the system, we use the speaker-independent part of the Resource Management (RM) database. We obtained an important improvement by introducing a novel kind of context-dependent basic classes used by the acoustic processor. The average RM recognition result with a word-pair grammar is now 95.2%, which is significantly better than a classical VQ system, slightly better than a different hybrid system with a recurrent network as probability estimator, and very close to the best continuous probability density function (PDF) HMM speech recognizers.
  • Keywords
    hidden Markov models; information theory; probability; self-organising feature maps; speech recognition; MMI neural networks; PDF; RM speech recognition task; acoustic processor; classical VQ-HMM system; context dependent basic classes; continuous probability density function; discrete Hidden Markov models; generated output labels; hybrid speech recognition system; information theory; maximum mutual information; phonetic classes; probability estimator; recurrent network; resource management database; self-organization; speaker independent recognition; state clustered triphones; word-pair grammar; Clustering algorithms; Databases; Hidden Markov models; Information theory; Loudspeakers; Mutual information; Neural networks; Probability density function; Resource management; Speech recognition;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Acoustics, Speech, and Signal Processing, 1996. ICASSP-96. Conference Proceedings., 1996 IEEE International Conference on
  • Conference_Location
    Atlanta, GA
  • ISSN
    1520-6149
  • Print_ISBN
    0-7803-3192-3
  • Type
    conf
  • DOI
    10.1109/ICASSP.1996.543258
  • Filename
    543258
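
The abstract describes a network trained so that its output labels carry maximum mutual information about the basic phonetic classes. As a rough illustration of that criterion only, the sketch below estimates the empirical mutual information between a sequence of discrete labels and the phonetic class of each frame. It is not the authors' training algorithm: the function name `mutual_information`, the toy label and class values, and the use of plain relative-frequency estimates are all assumptions made for this example.

```python
import math
from collections import Counter


def mutual_information(labels, classes):
    """Empirical mutual information I(label; class) in bits.

    labels  -- discrete output label per frame (hypothetical NN/VQ output)
    classes -- phonetic class per frame (hypothetical alignment)
    Probabilities are estimated by relative frequency (an assumption).
    """
    if len(labels) != len(classes):
        raise ValueError("labels and classes must have the same length")
    n = len(labels)
    if n == 0:
        raise ValueError("need at least one frame")

    joint = Counter(zip(labels, classes))   # counts of (label, class) pairs
    label_counts = Counter(labels)          # marginal counts of labels
    class_counts = Counter(classes)         # marginal counts of classes

    mi = 0.0
    for (y, c), count in joint.items():
        # p(y,c) * log2( p(y,c) / (p(y) * p(c)) ), written with raw counts
        ratio = count * n / (label_counts[y] * class_counts[c])
        mi += (count / n) * math.log2(ratio)
    return mi


# Toy data: labels that identify the class exactly versus labels that do not.
informative = mutual_information([0, 0, 1, 1], ["a", "a", "b", "b"])
uninformative = mutual_information([0, 1, 0, 1], ["a", "a", "b", "b"])
```

With two equally likely classes, labels that determine the class exactly reach the upper bound of one bit, while labels that are statistically independent of the class give zero. A labeler trained under an MMI criterion is pushed toward the first case.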