Title :
Frame-synchronous noise compensation for hands-free speech recognition in car environments
Author :
Chien, J.-T. ; Lin, M.-S.
Author_Institution :
Dept. of Comput. Sci. & Inf. Eng., Nat. Cheng Kung Univ., Tainan, Taiwan
fDate :
12/1/2000
Abstract :
It has become increasingly important to develop hands-free speech recognition techniques for the human-computer interface in car environments. However, severe car noise degrades speech recognition performance substantially. To compensate for the performance loss, it is necessary to adapt the original speech hidden Markov models (HMMs) to changing car environments. A novel frame-synchronous adaptation mechanism for in-car speech recognition is presented. This mechanism performs unsupervised model adaptation efficiently on a frame-by-frame basis, whereas conventional adaptation algorithms rely on batch adaptation data and supervision information. The proposed adaptation scheme is performed during frame likelihood calculation, where an optimal equalisation factor is first computed to equalise the model mean vector and the input frame vector. This equalisation factor then serves as a reference index to retrieve an additional bias vector for model mean adaptation. The result is a rapid and flexible algorithm that establishes a new robust likelihood measure. In experiments on hands-free in-car speech recognition with the microphone far from the talker, this framework is found to be effective in terms of recognition rate and computational cost under various driving speeds.
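The two steps described in the abstract (compute an equalisation factor per frame, then use it as an index to look up a bias vector) can be sketched for a single diagonal-covariance Gaussian as follows. This is a minimal illustration, not the authors' formulation: the abstract does not give the equations, so the closed-form least-squares factor, the uniform quantisation of the factor into bins, the form of the adapted mean (`alpha * mean + bias`), and all names (`equalisation_factor`, `robust_log_likelihood`, `bin_edges`, `bias_table`) are assumptions made here for illustration.

```python
import numpy as np


def equalisation_factor(x, mean, var):
    """Scale alpha minimising the variance-weighted distance between
    the frame x and the scaled mean alpha * mean.

    Assumed closed form (weighted least squares); requires a non-zero mean.
    """
    w = mean / var
    return float(x @ w) / float(mean @ w)


def robust_log_likelihood(x, mean, var, bin_edges, bias_table):
    """Frame log-likelihood under a diagonal Gaussian whose mean is
    adapted on the fly for this frame.

    bin_edges  : increasing 1-D array of K-1 thresholds on alpha (hypothetical)
    bias_table : (K, D) array, one bias vector per alpha bin (hypothetical;
                 in practice it would have to be estimated beforehand)
    """
    alpha = equalisation_factor(x, mean, var)
    # Use the factor as a reference index into the bias table.
    idx = int(np.clip(np.digitize(alpha, bin_edges), 0, len(bias_table) - 1))
    # Assumed adaptation rule: scaled mean plus the retrieved bias.
    adapted_mean = alpha * mean + bias_table[idx]
    diff = x - adapted_mean
    loglik = -0.5 * float(np.sum(np.log(2.0 * np.pi * var) + diff ** 2 / var))
    return loglik, alpha, idx


if __name__ == "__main__":
    mean = np.array([1.0, 2.0, 3.0])
    var = np.ones(3)
    frame = 2.0 * mean                    # frame is an exact rescaling of the mean
    edges = np.array([0.5, 1.5])          # three alpha bins
    biases = np.zeros((3, 3))             # no bias, for the sake of the demo
    print(robust_log_likelihood(frame, mean, var, edges, biases))
```

Because the factor and the table lookup are both computed inside the likelihood evaluation, no separate adaptation pass or transcription is needed in this sketch, which is the sense in which the scheme is frame-synchronous and unsupervised.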
Keywords :
acoustic noise; adaptive signal processing; automobiles; equalisers; hidden Markov models; optimisation; radiotelephony; speech recognition; HMM; bias vector; car noise; computational cost; driving speeds; experiments; frame likelihood calculation; frame-synchronous adaptation mechanism; frame-synchronous noise compensation; hands-free in-car speech recognition; human-computer interface; input frame vector; mean vector; microphone; model mean adaptation; optimal equalisation factor; performance loss compensation; recognition rate; reference index; robust likelihood measure; speech recognition performance; unsupervised model adaptation
Journal_Title :
Vision, Image and Signal Processing, IEE Proceedings -
DOI :
10.1049/ip-vis:20000693