Title :
Bayesian Adaptive Inference and Adaptive Training
Author :
Yu, Kai ; Gales, Mark J F
Author_Institution :
Cambridge Univ., Cambridge
Abstract :
Large-vocabulary speech recognition systems are often built using found data, such as broadcast news. In contrast to carefully collected data, found data normally contains multiple acoustic conditions, such as those arising from different speakers or environmental noise. Adaptive training is a powerful approach to building systems on such data. Here, transforms are used to represent the different acoustic conditions, and a canonical model is then trained given this set of transforms. This paper describes a Bayesian framework for adaptive training and inference. This framework addresses some limitations of standard maximum-likelihood approaches. In contrast to the standard approach, the adaptively trained system can be used directly in unsupervised inference, rather than having to rely on initial hypotheses being present. In addition, robust recognition performance can be obtained when adaptation data is limited. The limited data problem often occurs in testing, as there is no control over the amount of adaptation data available. In contrast, for adaptive training, it is possible to control the system complexity to reflect the available data. Thus, the standard point estimates may be used in training. As the integral associated with Bayesian adaptive inference is intractable, various marginalization approximations are described, including a variational Bayes approximation. Both batch and incremental modes of adaptive inference are discussed. These approaches are applied to adaptive training of maximum-likelihood linear regression and evaluated on a large-vocabulary speech recognition task. Bayesian adaptive inference is shown to significantly outperform standard approaches.
Keywords :
Bayes methods; acoustic signal processing; inference mechanisms; maximum likelihood estimation; regression analysis; speech recognition; transforms; unsupervised learning; Bayesian adaptive inference; adaptive training; environmental noise; large-vocabulary speech recognition; maximum-likelihood linear regression; speaker noise; unsupervised inference; variational Bayes approximation; Bayesian methods; Broadcasting; Loudspeakers; Power system modeling; Programmable control; Robustness; Testing; Working environment noise; Bayesian adaptation; Bayesian inference; incremental; variational Bayes
Journal_Title :
Audio, Speech, and Language Processing, IEEE Transactions on
DOI :
10.1109/TASL.2007.901300