DocumentCode
663849
Title
Dereverberation robust to speaker's azimuthal orientation in multi-channel human-robot communication
Author
Gomez, Raquel ; Nakamura, Kentaro ; Nakadai, Kazuhiro
Author_Institution
Honda Res. Inst. Japan Co., Ltd., Wako, Japan
fYear
2013
fDate
3-7 Nov. 2013
Firstpage
3439
Lastpage
3444
Abstract
The acoustical dynamics of reverberation in an enclosed environment pose a problem for human-robot communication. Any change in the azimuthal orientation of the speaker contributes to unpredictable acoustical activity, resulting in degraded performance of the automatic speech recognition (ASR) system. Thus, dereverberation techniques need to address this issue prior to ASR. Dereverberation in multi-channel applications primarily involves the adoption of a suitable reverberant model that results in a computationally feasible solution and at the same time yields an accurate estimate of the harmful reflections (i.e., late reflection) for effective suppression. In this paper, we address this problem by introducing a hybrid method based on multi-channel processing on a single-channel reverberant model platform. The proposed method is capable of accurate signal estimation, a property inherent to a multi-channel system, and at the same time retains the computational efficiency of a single-channel reverberant model approach. The proposed method is summarized as follows. First, multi-channel sound-source processing is employed to obtain the full reverberant and the late reflection signal estimates. Then, equalization is employed to update the late reflection estimate to reflect the change in azimuth prior to dereverberation. The equalization parameters for azimuthal change are obtained through an offline optimization procedure. Experimental evaluation in an actual human-robot communication environment shows that the proposed method outperforms existing methods in terms of robustness in the ASR performance.
Keywords
acoustic signal processing; human-robot interaction; reverberation; speaker recognition; speech processing; ASR; automatic speech recognition system; azimuthal change; computational efficiency; dereverberation; multichannel human-robot communication; multichannel sound-source processing; offline optimization procedure; reverberation acoustical dynamics; signal estimation; single-channel reverberant model platform; speaker azimuthal orientation; speech-based human-robot interaction; unpredictable acoustical activity; Azimuth; Computational modeling; Microphones; Reverberation; Robots; Speech; Training;
fLanguage
English
Publisher
IEEE
Conference_Titel
Intelligent Robots and Systems (IROS), 2013 IEEE/RSJ International Conference on
Conference_Location
Tokyo
ISSN
2153-0858
Type
conf
DOI
10.1109/IROS.2013.6696846
Filename
6696846
Link To Document