DocumentCode
2948210
Title
Temporal smearing compensation in reverberant environment for speech-based human-robot interaction
Author
Gomez, Randy ; Nakamura, Keisuke ; Mizumoto, Takeshi ; Nakadai, Kazuhiro
Author_Institution
Honda Res. Inst. Japan Ltd. Co., Wako, Japan
fYear
2015
fDate
26-30 May 2015
Firstpage
3347
Lastpage
3353
Abstract
Speech-based human-robot interaction is often plagued by issues such as reverberation and changes in speaker position, which degrade overall performance. In this paper, we present a method for compensating for the joint effects of reverberation and changes in speaker position. The acoustic perturbation caused by these two factors degrades Automatic Speech Recognition (ASR) and, in turn, Spoken Language Understanding (SLU), which ultimately leads to a failure in the human-robot interaction experience. The proposed method is specifically designed to address the challenging environmental conditions in which robots are deployed. First, we analyze the impact of reverberation in the form of temporal smearing for each change in speaker position. Then, we extract the smearing coefficients that capture the joint dynamics between the speech signal at the current position and the room acoustics as observed by the robot. These coefficients are used to update the room transfer function (RTF), and the suppression parameters are stored offline. Moreover, all of these processes are optimized in the context of the ASR system for robot applications. In the online mode, the reverberant data at an arbitrary position is processed using the parameters pre-computed offline. This effectively compensates for the joint effects of reverberation at the arbitrary speaker position. Experimental results using real data gathered in a human-robot communication setting show that the proposed method outperforms existing methods.
Keywords
acoustic signal processing; human-robot interaction; perturbation techniques; reverberation; speech recognition; speech-based user interfaces; ASR system; RTF; SLU; acoustic perturbation; automatic speech recognition; joint dynamics; online mode; reverberant environment; room transfer function; smearing coefficients; speaker position; speech-based human-robot interaction; spoken language understanding; suppression parameters; temporal smearing compensation; Databases; Hidden Markov models; Joints; Reverberation; Robots; Speech; Automatic Speech Recognition; Dereverberation; Robustness; Speech Enhancement;
fLanguage
English
Publisher
ieee
Conference_Titel
Robotics and Automation (ICRA), 2015 IEEE International Conference on
Conference_Location
Seattle, WA
Type
conf
DOI
10.1109/ICRA.2015.7139661
Filename
7139661