Aggregate a posteriori linear regression adaptation

Author

Chien, Jen-Tzung ; Huang, Chih-Hsien

Author_Institution

Dept. of Comput. Sci. & Inf. Eng., Nat. Cheng Kung Univ., Tainan, Taiwan

Volume

14

Issue

3

fYear

2006

fDate

5/1/2006 12:00:00 AM

Firstpage

797

Lastpage

807

Abstract

We present a new discriminative linear regression adaptation algorithm for hidden Markov model (HMM) based speech recognition. The cluster-dependent regression matrices are estimated from speaker-specific adaptation data by maximizing the aggregate a posteriori probability, which can be expressed as a classification error function that adopts the logarithm of the posterior distribution as the discriminant function. Accordingly, aggregate a posteriori linear regression (AAPLR) is developed for discriminative adaptation in which the classification errors of the adaptation data are minimized. Because the prior distribution of the regression matrix is involved, AAPLR is equipped with Bayesian learning capability. We demonstrate that the difference between AAPLR discriminative adaptation and maximum a posteriori linear regression (MAPLR) adaptation lies in the treatment of the evidence. Unlike minimum classification error linear regression (MCELR), AAPLR has a closed-form solution, which enables rapid adaptation. Experimental results show that AAPLR speaker adaptation improves speech recognition performance at moderate computational cost compared to maximum likelihood linear regression (MLLR), MAPLR, MCELR, and conditional maximum likelihood linear regression (CMLLR). These results are verified for both supervised and unsupervised adaptation with different amounts of adaptation data.
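
The abstract refers to cluster-dependent regression matrices without showing how they act on the model. As a rough illustration only, the sketch below applies the affine mean transform commonly used in linear regression adaptation (the adapted mean is W times the extended vector [1, mu]), with one matrix per regression cluster. The function name `adapt_means`, the data layout, and the toy numbers are assumptions made for this example; how AAPLR estimates each W is not shown and is not taken from the paper.

```python
def adapt_means(means, cluster_ids, regression_matrices):
    """Apply cluster-dependent affine transforms to Gaussian mean vectors.

    means               -- list of d-dimensional mean vectors (hypothetical layout)
    cluster_ids         -- regression cluster index for each mean
    regression_matrices -- one d x (d+1) matrix per cluster, W = [b, A]
    """
    adapted = []
    for mu, cluster in zip(means, cluster_ids):
        W = regression_matrices[cluster]
        # Extended mean vector xi = [1, mu_1, ..., mu_d]; the leading 1 picks up the bias column.
        xi = [1.0] + list(mu)
        # Adapted mean: mu_hat = W @ xi, computed row by row.
        adapted.append([sum(w * x for w, x in zip(row, xi)) for row in W])
    return adapted


# Toy example with two clusters in two dimensions (made-up values).
identity_W = [[0.0, 1.0, 0.0],
              [0.0, 0.0, 1.0]]   # zero bias, A = I: leaves means unchanged
scaled_W = [[1.0, 2.0, 0.0],
            [-1.0, 0.0, 2.0]]    # bias (1, -1), A = 2I

means = [[0.0, 1.0], [2.0, 3.0], [4.0, 5.0]]
cluster_ids = [0, 1, 0]
adapted = adapt_means(means, cluster_ids, [identity_W, scaled_W])
```

In this toy setup only the second mean belongs to the cluster with a non-trivial transform, so it is the only one that moves. Sharing one matrix across all Gaussians in a cluster is what allows adaptation from a small amount of speaker-specific data.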

Keywords

belief networks; hidden Markov models; learning (artificial intelligence); matrix algebra; maximum likelihood estimation; regression analysis; speech recognition; Bayesian learning; HMM; aggregate a posteriori linear regression adaptation; cluster-dependent regression matrices; discriminative linear regression adaptation algorithm; hidden Markov model; maximum a posteriori linear regression; speaker-specific adaptation data; unsupervised adaptation; Aggregates; Bayesian methods; Clustering algorithms; Linear regression; Maximum likelihood linear regression; Natural languages; Robustness; Aggregate a posteriori criterion; discriminative adaptation; linear regression adaptation; speaker adaptation

fLanguage

English

Journal_Title

Audio, Speech, and Language Processing, IEEE Transactions on

Publisher

IEEE

ISSN

1558-7916

Type

jour

DOI

10.1109/TSA.2005.860847

Filename

1621195