Mixed Bayesian networks with auxiliary variables for automatic speech recognition

Author

Stephenson, Todd A. ; Magimai-Doss, Mathew ; Bourlard, Hervé

Author_Institution

Dalle Molle Inst. for Perceptual Artificial Intelligence, Martigny, Switzerland

Volume

4

fYear

2002

fDate

2002

Firstpage

293

Abstract

In standard automatic speech recognition (ASR), hidden Markov models (HMMs) calculate their emission probabilities by an artificial neural network (ANN) or a Gaussian distribution conditioned only upon the hidden state variable. Stephenson et al. (2001) showed the benefit of conditioning the emission distributions also upon a discrete auxiliary variable, which is observed in training and hidden in recognition. Related work (Fujinaga et al., 2001) has shown the utility of conditioning the emission distributions on a continuous auxiliary variable. We apply mixed Bayesian networks (BNs) to extend these works by introducing a continuous auxiliary variable that is observed in training but is hidden in recognition. We find that an auxiliary pitch variable conditioned itself upon the hidden state can degrade performance unless the auxiliary variable is also hidden. The performance, furthermore, can be improved by making the auxiliary pitch variable independent of the hidden state.
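The abstract's idea of an emission distribution conditioned on a continuous auxiliary variable that is hidden in recognition can be illustrated with a small sketch. It assumes a scalar linear-Gaussian emission, p(x | q, a) = N(x; mu_q + b_q * a, var_q), with the auxiliary variable a ~ N(mu_a, var_a) independent of the hidden state q. This parametric form, and every name and parameter value below, is an illustrative assumption rather than something the abstract states. Under that assumption, hiding the auxiliary variable amounts to integrating it out, which has a closed form; the sketch checks it against numerical integration.

```python
import math


def gaussian_pdf(x, mean, var):
    """Density of a scalar Gaussian N(mean, var) evaluated at x."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def emission_observed_aux(x, a, mu_q, b_q, var_q):
    """p(x | q, a): the state-dependent mean shifts linearly with the
    auxiliary value a (hypothetical linear-Gaussian form)."""
    return gaussian_pdf(x, mu_q + b_q * a, var_q)


def emission_hidden_aux(x, mu_q, b_q, var_q, mu_a, var_a):
    """p(x | q) with the auxiliary variable integrated out.

    For a ~ N(mu_a, var_a) independent of q, the marginal is again
    Gaussian, with mean mu_q + b_q * mu_a and variance
    var_q + b_q**2 * var_a.
    """
    return gaussian_pdf(x, mu_q + b_q * mu_a, var_q + b_q ** 2 * var_a)


def emission_hidden_aux_numeric(x, mu_q, b_q, var_q, mu_a, var_a, n=2001, width=10.0):
    """Trapezoidal approximation of the same integral over a, used only
    to check the closed form above."""
    sd = math.sqrt(var_a)
    lo = mu_a - width * sd
    step = 2.0 * width * sd / (n - 1)
    total = 0.0
    for i in range(n):
        a = lo + i * step
        weight = 0.5 if i in (0, n - 1) else 1.0
        total += weight * emission_observed_aux(x, a, mu_q, b_q, var_q) * gaussian_pdf(a, mu_a, var_a)
    return total * step


# Hypothetical parameter values, chosen only for illustration.
params = dict(mu_q=0.5, b_q=0.8, var_q=1.0, mu_a=0.3, var_a=0.5)
closed = emission_hidden_aux(1.2, **params)
numeric = emission_hidden_aux_numeric(1.2, **params)
```

In training, where the auxiliary value is observed, `emission_observed_aux` would be evaluated directly; in recognition, where it is hidden, the marginal from `emission_hidden_aux` would take its place. The case in which the auxiliary variable depends on the hidden state would need state-specific `mu_a` and `var_a`, which this sketch does not cover.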

Keywords

Gaussian distribution; belief networks; hidden Markov models; speech recognition; automatic speech recognition; continuous auxiliary variable; emission distributions; mixed Bayesian networks; pitch variable; Acoustic emission; Artificial intelligence; Artificial neural networks; Bayesian methods; Degradation; Integrated circuit modeling; Pattern recognition

fLanguage

English

Publisher

IEEE

Conference_Title

Proceedings of the 16th International Conference on Pattern Recognition, 2002

ISSN

1051-4651

Print_ISBN

0-7695-1695-X

Type

conf

DOI

10.1109/ICPR.2002.1047454

Filename

1047454