Title :
Automatic determination of acoustic model topology using variational Bayesian estimation and clustering for large vocabulary continuous speech recognition
Author :
Watanabe, Shinji ; Sako, Atsushi ; Nakamura, Atsushi
Author_Institution :
NTT Commun. Sci. Labs., NTT Corp., Kyoto, Japan
fDate :
5/1/2006
Abstract :
We describe the automatic determination of a large and complicated acoustic model for speech recognition by using variational Bayesian estimation and clustering (VBEC). We propose an efficient method for decision tree clustering based on a Gaussian mixture model (GMM) and an efficient model search algorithm for finding an appropriate acoustic model topology within the VBEC framework. The GMM-based decision tree clustering of triphone hidden Markov model (HMM) states uses a novel approach that reduces the otherwise impractically large amount of computation to a practical level by utilizing the statistics of monophone HMM states. The model search algorithm also reduces the search space by exploiting the characteristics of the acoustic model. The experimental results confirmed that VBEC automatically and rapidly yielded an optimum model topology with the highest performance.
Keywords :
Bayes methods; Gaussian processes; acoustics; hidden Markov models; speech processing; speech recognition; Gaussian mixture model; acoustic model topology; monophone hidden Markov model; search algorithm; variational Bayesian estimation and clustering (VBEC); large vocabulary continuous speech recognition; Bayesian methods; Clustering algorithms; Decision trees; Humans; Iterative algorithms; Statistics; Topology; Vocabulary; Determination of acoustic model topologies; variational Bayes
Journal_Title :
Audio, Speech, and Language Processing, IEEE Transactions on
DOI :
10.1109/TSA.2005.857791