Title :
Automatic Complexity Control of Generalized Variable Parameter HMMs for Noise Robust Speech Recognition
Author :
Rongfeng Su ; Xunying Liu ; Lan Wang
Author_Institution :
Shenzhen Inst. of Adv. Technol., Chinese Univ. of Hong Kong, Shenzhen, China
Abstract :
An important part of the acoustic modelling problem for automatic speech recognition (ASR) systems is handling the mismatch with a target environment caused by time-varying external factors such as ambient noise. One possible solution is to introduce controllability into the underlying acoustic model, allowing instantaneous adaptation to the current noise condition. Along this line, the continuous trajectory of optimal, well-matched model parameters against the varying noise can be explicitly modelled using, for example, generalized variable parameter HMMs (GVP-HMMs). To improve the generalization and computational efficiency of conventional GVP-HMMs, this paper investigates a novel model complexity control method for GVP-HMMs. The optimal polynomial degrees of the Gaussian mean, variance and model-space linear transform trajectories are determined automatically at a local level. Significant relative error rate reductions of 20% and 28% were obtained over the multi-style training baseline systems on Aurora 2 and on a medium vocabulary Mandarin Chinese speech recognition task, respectively. Consistent performance improvements and a relative model size compression of 60% were also obtained over baseline GVP-HMM systems that use a uniformly assigned polynomial degree.
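The idea of choosing a polynomial degree separately for each parameter trajectory can be illustrated with a small sketch. The abstract does not state which selection criterion the paper uses, so the Bayesian information criterion (BIC) below is only an assumed stand-in, and the function name select_degree_bic, the SNR grid and the toy trajectories are all hypothetical.

```python
import numpy as np
from numpy.polynomial import polynomial as P


def select_degree_bic(snr, values, max_degree=4):
    """Return (degree, coefficients) of the polynomial with the lowest BIC.

    BIC is an assumed criterion used for illustration only; it is not
    taken from the paper.
    """
    snr = np.asarray(snr, dtype=float)
    values = np.asarray(values, dtype=float)
    n = len(snr)
    best_degree, best_bic, best_coeffs = 0, np.inf, None
    for degree in range(max_degree + 1):
        # Least-squares fit; coefficients are ordered from low to high power.
        coeffs = P.polyfit(snr, values, degree)
        residuals = values - P.polyval(snr, coeffs)
        # Floor the residual sum of squares so log() stays finite on exact fits.
        rss = max(float(np.sum(residuals ** 2)), 1e-12)
        bic = n * np.log(rss / n) + (degree + 1) * np.log(n)
        if bic < best_bic:  # strict '<' keeps the lower degree on ties
            best_degree, best_bic, best_coeffs = degree, bic, coeffs
    return best_degree, best_coeffs


# Hypothetical noise-free toy trajectories over an SNR grid (dB).
snr_grid = np.arange(0.0, 21.0, 2.0)
mean_traj = 1.0 + 0.5 * snr_grid - 0.02 * snr_grid ** 2  # quadratic in SNR
var_traj = np.full_like(snr_grid, 3.0)                   # constant in SNR

mean_degree, mean_coeffs = select_degree_bic(snr_grid, mean_traj)
var_degree, var_coeffs = select_degree_bic(snr_grid, var_traj)

# Coefficients stored with locally chosen degrees versus a uniform degree of 4.
local_size = (mean_degree + 1) + (var_degree + 1)
uniform_size = 2 * (4 + 1)
```

In this toy setting the locally chosen degrees store fewer coefficients than a uniform degree would, which is the kind of model size saving the abstract reports, although the figures here bear no relation to the paper's results.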
Keywords :
Gaussian processes; acoustic noise; acoustic signal processing; computational complexity; controllability; error statistics; hidden Markov models; polynomials; speech recognition; ASR; Aurora 2; GVP-HMM system; Gaussian mean; Mandarin Chinese speech recognition; acoustic modelling problem; automatic model complexity control; automatic speech recognition; continuous trajectory; controllability; error rate reduction; generalized variable parameter; matched model parameters; model space linear transform trajectory; multistyle training baseline system; noise robust speech recognition; optimal polynomial degrees; time-varying external factors; variance; Complexity theory; Hidden Markov models; Mathematical model; Noise; Polynomials; Trajectory; Complexity control; generalized variable parameter HMMs; robust speech recognition; variable noise;
Journal_Title :
Audio, Speech, and Language Processing, IEEE/ACM Transactions on
DOI :
10.1109/TASLP.2014.2372901