DocumentCode :
62390
Title :
Linear Regression Based Acoustic Adaptation for the Subspace Gaussian Mixture Model
Author :
Ghalehjegh, Sina Hamidi ; Rose, Richard C.
Author_Institution :
Dept. of Electr. & Comput. Eng., McGill Univ., Montreal, QC, Canada
Volume :
22
Issue :
9
fYear :
2014
fDate :
Sept. 2014
Firstpage :
1391
Lastpage :
1402
Abstract :
This paper presents a study of two acoustic speaker adaptation techniques applied in the context of the subspace Gaussian mixture model (SGMM) for automatic speech recognition (ASR). First, a model-space, linear-regression-based approach is presented for adaptation of SGMM state projection vectors; it is referred to as subspace vector adaptation (SVA). Second, an easy-to-implement realization of constrained maximum likelihood linear regression (CMLLR) is presented for feature-space adaptation in the SGMM. Numerically stable procedures for row-by-row estimation of the regression-based transformation matrices are presented for both SVA and CMLLR adaptation. These approaches are applied to SGMM models that are estimated using speaker adaptive training (SAT), a technique for estimating more compact speaker-independent acoustic models. Unsupervised speaker adaptation performance is evaluated on conversational and read speech task domains and compared to the unsupervised adaptation performance obtained using the hidden Markov model-Gaussian mixture model (HMM-GMM) in ASR. It is shown that the feature-space and model-space adaptation approaches applied to the SGMM provide complementary reductions in word error rate (WER) and yield lower WERs than those obtained using CMLLR adaptation for the HMM-GMM.
Keywords :
Gaussian processes; error statistics; hidden Markov models; mixture models; regression analysis; speech recognition; CMLLR; HMM-GMM; SGMM; SVA; WER; acoustic adaptation; acoustic speaker adaptation; automatic speech recognition; constrained maximum likelihood linear regression; feature space adaptation; hidden Markov model-Gaussian mixture model; model space linear regression; speaker adaptive training; subspace Gaussian mixture model; subspace vector adaptation; word error rate; Acoustics; Adaptation models; Covariance matrices; Hidden Markov models; Linear regression; Speech; Vectors; Automatic speech recognition; constrained maximum likelihood linear regression; speaker adaptation; subspace modeling
fLanguage :
English
Journal_Title :
Audio, Speech, and Language Processing, IEEE/ACM Transactions on
Publisher :
IEEE
ISSN :
2329-9290
Type :
jour
DOI :
10.1109/TASLP.2014.2332043
Filename :
6840365