Regional Information Center for Science and Technology - Robustness to speaker position in distant-talking automatic speech recognition

DocumentCode :

1688022

Title :

Robustness to speaker position in distant-talking automatic speech recognition

Author :

Gomez, Raquel ; Nakamura, Kentaro ; Nakadai, Kazuhiro

Author_Institution :

Honda Res. Inst. Japan Co. Ltd., Japan

Year :

2013

Firstpage :

7034

Lastpage :

7038

Abstract :

In this paper, we present a method that significantly improves on our previous work in single-channel dereverberation. The proposed method is more robust to changes in speaker position in distant-talking ASR. First, we update the room transfer function (RTF) and the weighting parameters for dereverberation to match the target speaker position. This scheme corrects speech power variation as a function of position at the waveform level. We also verify the impact of this correction on the acoustic model. Then, we implement a fast acoustic model update that reflects the speech power level at the target speaker position. Furthermore, the model update scheme is simple and avoids time-consuming model re-estimation. As a result, the proposed method can be executed online. Together, these corrective measures significantly reduce the mismatch between training and testing conditions. We test our method using real reverberant data from different locations inside a room. Experimental results show that the proposed method outperforms conventional methods in terms of ASR performance. Moreover, our fast acoustic model update scheme is on par with time-consuming model re-estimation in terms of recognition performance.

Keywords :

reverberation; speaker recognition; speech enhancement; RTF; acoustic model; distant talking ASR; distant-talking automatic speech recognition; fast acoustic model update scheme; room transfer function; single-channel dereverberation; speaker position; speech power level; speech power variation; testing conditions; time-consuming model reestimation; training conditions; waveform level; weighting parameters; Acoustics; Adaptation models; Data models; Hidden Markov models; Robustness; Speech; Automatic Speech Recognition; Dereverberation

Language :

English

Publisher :

IEEE

Conference_Title :

Acoustics, Speech and Signal Processing (ICASSP), 2013 IEEE International Conference on

Conference_Location :

Vancouver, BC

ISSN :

1520-6149

Type :

conf

DOI :

10.1109/ICASSP.2013.6639026

Filename :

6639026

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=1688022