Dereverberation robust to speaker's azimuthal orientation in multi-channel human-robot communication

Author

Gomez, Raquel ; Nakamura, Kentaro ; Nakadai, Kazuhiro

Author_Institution

Honda Res. Inst. Japan Ltd. Co., Wako, Japan

fYear

2013

fDate

3-7 Nov. 2013

Firstpage

3439

Lastpage

3444

Abstract

The acoustical dynamics of reverberation in an enclosed environment pose a problem for human-robot communication. Any change in the speaker's azimuthal orientation produces unpredictable acoustical activity that degrades the performance of the automatic speech recognition (ASR) system. Thus, dereverberation techniques need to address this issue prior to ASR. Dereverberation in multi-channel applications primarily revolves around adopting a suitable reverberant model that results in a computationally feasible solution and at the same time yields an accurate estimate of the harmful reflections (i.e., late reflections) for effective suppression. In this paper we address this problem by introducing a hybrid method based on multi-channel processing on a single-channel reverberant model platform. The proposed method is capable of accurate signal estimation, a property inherent to a multi-channel system, and at the same time retains the computational efficiency of the single-channel reverberant model approach. The proposed method is summarized as follows. First, multi-channel sound-source processing is employed to obtain the full reverberant and the late reflection signal estimates. Then, equalization is employed to update the late reflection estimate so that it reflects the change in azimuth prior to dereverberation. The equalization parameters for azimuthal change are obtained through an offline optimization procedure. Experimental evaluation in an actual human-robot communication environment shows that the proposed method outperforms existing methods in terms of the robustness of ASR performance.
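
The two-step procedure in the abstract (estimate, then equalize before suppression) can be illustrated with a minimal sketch. The abstract does not specify the suppression rule, the form of the equalization parameters, or how the azimuth is obtained, so everything below is an assumption: power spectral subtraction with a floor stands in for the dereverberation step, the equalization is modeled as one gain per frequency band, and the function names, the gain table, and its values are hypothetical placeholders rather than the authors' implementation.

```python
from typing import Dict, List


def equalize_late_reflection(late_power: List[float], gains: List[float]) -> List[float]:
    """Scale the late-reflection power in each frequency band by its gain.

    The per-band gain form is an assumption; the abstract only says that
    equalization parameters are obtained offline for each azimuthal change.
    """
    if len(late_power) != len(gains):
        raise ValueError("late_power and gains must have the same number of bands")
    return [g * p for g, p in zip(gains, late_power)]


def suppress_late_reflection(
    reverberant_power: List[float], late_power: List[float], floor: float = 0.01
) -> List[float]:
    """Power spectral subtraction with a floor (assumed suppression rule)."""
    if len(reverberant_power) != len(late_power):
        raise ValueError("both spectra must have the same number of bands")
    # The floor keeps each band at or above a small fraction of its input power.
    return [max(x - l, floor * x) for x, l in zip(reverberant_power, late_power)]


def nearest_azimuth(azimuth_deg: float, table: Dict[int, List[float]]) -> int:
    """Pick the tabulated azimuth closest to azimuth_deg on the circle."""
    def circular_distance(a: int) -> float:
        d = abs(a - azimuth_deg) % 360
        return min(d, 360 - d)
    return min(table, key=circular_distance)


def dereverberate_frame(
    reverberant_power: List[float],
    late_power: List[float],
    azimuth_deg: float,
    gain_table: Dict[int, List[float]],
    floor: float = 0.01,
) -> List[float]:
    """Equalize the late-reflection estimate for the azimuth, then suppress it."""
    gains = gain_table[nearest_azimuth(azimuth_deg, gain_table)]
    equalized = equalize_late_reflection(late_power, gains)
    return suppress_late_reflection(reverberant_power, equalized, floor)


# Hypothetical gain table: azimuth in degrees -> per-band gains (made-up values).
EXAMPLE_GAINS = {0: [1.0, 1.0], 90: [0.5, 2.0]}
```

In this sketch the inputs are per-band power values for a single frame; in the setting the abstract describes, the full reverberant and late reflection estimates would come from the multi-channel sound-source processing stage, which is not modeled here.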

Keywords

acoustic signal processing; human-robot interaction; reverberation; speaker recognition; speech processing; ASR; automatic speech recognition system; azimuthal change; computational efficiency; dereverberation; multichannel human-robot communication; multichannel sound-source processing; offline optimization procedure; reverberation acoustical dynamics; signal estimation; single-channel reverberant model platform; speaker azimuthal orientation; speech-based human-robot interaction; unpredictable acoustical activity; Azimuth; Computational modeling; Microphones; Reverberation; Robots; Speech; Training;

fLanguage

English

Publisher

IEEE

Conference_Titel

Intelligent Robots and Systems (IROS), 2013 IEEE/RSJ International Conference on

Conference_Location

Tokyo

ISSN

2153-0858

Type

conf

DOI

10.1109/IROS.2013.6696846

Filename

6696846