Hierarchical mixtures of experts methodology applied to continuous speech recognition

Author

Zhao, Ying ; Schwartz, Richard ; Sroka, Jason ; Makhoul, John

Author_Institution

BBN Syst. & Technol. Corp., Cambridge, MA, USA

fYear

1995

fDate

31 Aug-2 Sep 1995

Firstpage

263

Lastpage

271

Abstract

In this paper, we incorporate the hierarchical mixtures of experts (HME) method of probability estimation, developed by Jordan (1994), into a hidden Markov model (HMM)-based continuous speech recognition system. The resulting system can be thought of as a continuous-density HMM system, but instead of using Gaussian mixtures, the HME system employs a large set of hierarchically organized but relatively small neural networks to perform the probability density estimation. The hierarchical structure is reminiscent of a decision tree except for two important differences: each “expert” or neural net performs a “soft” decision rather than a hard decision, and, unlike ordinary decision trees, the parameters of all the neural nets in the HME are automatically trainable using the expectation-maximisation algorithm. We report results on the ARPA 5,000-word and 40,000-word Wall Street Journal corpus using HME models.
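The "soft" decision described in the abstract can be illustrated with a minimal sketch of a generic two-level HME forward pass. This is not the system reported in the paper: the linear gates and experts, the helper names (`softmax`, `linear`, `hme_forward`), and all weight values are assumptions chosen only for illustration, and EM training is omitted.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def linear(weights, x):
    """Multiply a weight matrix (list of rows) by the input vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]

def hme_forward(x, top_gate, branch_gates, experts):
    """Blend expert outputs using soft gate probabilities at both levels.

    Every expert contributes to the result, weighted by the product of the
    gate probabilities along its path, instead of one path being selected.
    """
    g_top = softmax(linear(top_gate, x))
    n_classes = len(experts[0][0])
    mixture = [0.0] * n_classes
    for i, g_i in enumerate(g_top):
        g_branch = softmax(linear(branch_gates[i], x))
        for j, g_ij in enumerate(g_branch):
            p_expert = softmax(linear(experts[i][j], x))
            for k in range(n_classes):
                mixture[k] += g_i * g_ij * p_expert[k]
    return mixture

# Hypothetical toy setup: 2-dimensional input, 2 branches, 2 experts per
# branch, 3 output classes. All weights are made up.
x = [1.0, 0.5]
top_gate = [[0.2, -0.1], [-0.3, 0.4]]
branch_gates = [
    [[0.1, 0.0], [0.0, 0.1]],
    [[-0.2, 0.3], [0.5, -0.5]],
]
experts = [
    [
        [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]],
        [[-1.0, 0.2], [0.3, 0.3], [0.0, -0.4]],
    ],
    [
        [[0.2, 0.2], [0.1, -0.1], [0.0, 0.6]],
        [[0.7, -0.7], [0.0, 0.0], [-0.2, 0.9]],
    ],
]
probs = hme_forward(x, top_gate, branch_gates, experts)
print(probs)
```

A hard decision tree would correspond to replacing each gate's softmax with a one-hot choice of the highest-scoring branch, so that only a single expert determines the output.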

Keywords

decision theory; estimation theory; hidden Markov models; hierarchical systems; neural nets; probability; speech recognition; continuous speech recognition; decision trees; expectation-maximisation algorithm; hidden Markov model; hierarchical mixtures of experts; hierarchical structure; neural networks; probability density estimation; Classification tree analysis; Decision trees; Equations; Hidden Markov models; Large-scale systems; Neural networks; Speech recognition; State estimation; Training data;

fLanguage

English

Publisher

IEEE

Conference_Titel

Neural Networks for Signal Processing [1995] V. Proceedings of the 1995 IEEE Workshop

Conference_Location

Cambridge, MA

Print_ISBN

0-7803-2739-X

Type

conf

DOI

10.1109/NNSP.1995.514900

Filename

514900