Regional Information Center for Science and Technology - HMM adaptation using a phase-sensitive acoustic distortion model for environment-robust speech recognition

DocumentCode :

3422334

Title :

HMM adaptation using a phase-sensitive acoustic distortion model for environment-robust speech recognition

Author :

Li, Jinyu ; Deng, Li ; Yu, Dong ; Gong, Yifan ; Acero, Alex

Author_Institution :

Microsoft Corp., Redmond, WA

fYear :

2008

fDate :

March 31, 2008 - April 4, 2008

Firstpage :

4069

Lastpage :

4072

Abstract :

In this paper, we present a new approach to HMM adaptation that jointly compensates for additive and convolutive acoustic distortion in environment-robust speech recognition. The hallmark of our new approach is the use of a nonlinear, phase-sensitive model of acoustic distortion that captures phase asynchrony between clean speech and the mixing noise. In the first step of the developed algorithm, both the static and dynamic portions of the noise and channel parameters are estimated in the cepstral domain, using the speech recognizer's "feedback" information and the vector-Taylor-series linearization technique on the nonlinear phase-sensitive model. In the second step, the estimated noise and channel parameters are used to effectively adapt the static and dynamic portions of the HMM means and variances, also using the linearized phase-sensitive acoustic distortion model. In the experimental evaluation using the standard Aurora 2 task, the proposed new algorithm achieves 93.3% accuracy using the clean-trained complex HMM backend as the baseline system for unsupervised HMM adaptation. This reaches the highest performance number in the literature on this task with a clean-trained HMM model. The experimental results show that the phase term, which was missing in all previous HMM-adaptation work, contributes significantly to the achieved high recognition accuracy.
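
The second step described in the abstract (adapting HMM means and variances through a linearized phase-sensitive model) can be illustrated with a minimal sketch. This record does not reproduce the paper's equations, so everything below is an assumption: it uses a commonly written form of the phase-sensitive relation for a single log-filter-bank channel, y = x + h + log(1 + exp(n - x - h) + 2*alpha*exp((n - x - h)/2)), rather than the cepstral-domain vector form the abstract refers to, it covers only the static parameters, and the function names and the scalar phase factor `alpha` are hypothetical. Setting `alpha` to 0 recovers the usual phase-insensitive model.

```python
import math


def distortion(x, n, h, alpha):
    """Noisy log-spectrum y for one filter-bank channel (assumed model form).

    x, n, h: clean-speech, noise, and channel log-spectra; alpha: phase factor.
    Assumes alpha > -1 so the argument of the logarithm stays positive.
    """
    z = n - x - h
    return x + h + math.log(1.0 + math.exp(z) + 2.0 * alpha * math.exp(z / 2.0))


def jacobian_wrt_speech(mu_x, mu_n, mu_h, alpha):
    """dy/dx evaluated at the expansion point (mu_x, mu_n, mu_h)."""
    z = mu_n - mu_x - mu_h
    denom = 1.0 + math.exp(z) + 2.0 * alpha * math.exp(z / 2.0)
    return (1.0 + alpha * math.exp(z / 2.0)) / denom


def adapt_gaussian(mu_x, var_x, mu_n, var_n, mu_h, alpha):
    """First-order Taylor (VTS-style) adaptation of one static Gaussian.

    The mean is passed through the nonlinear model at the expansion point;
    the variance is propagated with the Jacobian g (speech) and 1 - g (noise).
    The channel is treated as a deterministic offset (no variance term).
    """
    g = jacobian_wrt_speech(mu_x, mu_n, mu_h, alpha)
    mu_y = distortion(mu_x, mu_n, mu_h, alpha)
    var_y = g * g * var_x + (1.0 - g) ** 2 * var_n
    return mu_y, var_y


if __name__ == "__main__":
    # Illustrative numbers only; they are not taken from the paper.
    print(adapt_gaussian(mu_x=2.0, var_x=1.0, mu_n=1.0, var_n=0.5,
                         mu_h=0.1, alpha=0.3))
```

In this sketch the Jacobian with respect to noise is 1 - g, so a Gaussian dominated by noise (g near 0) takes on the noise variance, while a high-SNR Gaussian (g near 1) keeps the clean-speech variance.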

Keywords :

acoustic distortion; hidden Markov models; series (mathematics); speech recognition; Aurora 2 task; HMM adaptation; additive distortion; convolutive acoustic distortion; environment-robust speech recognition; mixing noise; nonlinear phase-sensitive model; phase asynchrony; phase-sensitive acoustic distortion model; vector-Taylor-series linearization technique; Acoustic distortion; Acoustic noise; Cepstral analysis; Hidden Markov models; Parameter estimation; Phase estimation; Phase noise; Speech enhancement; Speech recognition; Working environment noise; additive and convolutive distortions; phase-sensitive distortion model; robust ASR; vector Taylor series;

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Acoustics, Speech and Signal Processing, 2008. ICASSP 2008. IEEE International Conference on

Conference_Location :

Las Vegas, NV

ISSN :

1520-6149

Print_ISBN :

978-1-4244-1483-3

Electronic_ISBN :

1520-6149

Type :

conf

DOI :

10.1109/ICASSP.2008.4518548

Filename :

4518548

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=3422334