DocumentCode :
3709951
Title :
Robot-Audition-based Human-Machine Interface for a Car
Author :
Kazuhiro Nakadai;Takeshi Mizumoto;Keisuke Nakamura
Author_Institution :
Honda Research Institute Japan Co., Ltd., 8-1 Honcho, Wako, Saitama 351-0188, JAPAN
fYear :
2015
fDate :
9/1/2015
Firstpage :
6129
Lastpage :
6136
Abstract :
This paper describes a Robot-Audition-based Car Human-Machine Interface (RA-CHMI). An RA-CHMI, like a car navigation system, has difficulty dealing with voice commands, since there are many noise sources in a car, including road noise, the air conditioner, music, and passengers. Microphone array processing developed in robot audition may overcome this problem. Robot audition techniques, including sound source localization, Voice Activity Detection (VAD), sound source separation, and barge-in-able processing, were introduced by considering the characteristics of RA-CHMI. Automatic Speech Recognition (ASR) based on a Deep Neural Network (DNN) improved recognition performance and robustness in a noisy environment. In addition, as an integrated framework, HARK-Dialog was developed to build a multi-party and multimodal dialog system, enabling the seamless use of cloud and local services with a pluggable modular architecture. The constructed multi-party and multimodal RA-CHMI system did not require a push-to-talk button, nor did it require turning down the audio volume or the air conditioner when issuing speech commands. It could also control a four-DOF robot agent to make the system's responses more understandable. The proposed RA-CHMI was validated by evaluating essential techniques in the system, such as VAD and DNN-ASR, using real speech data recorded during driving. The entire design of the RA-CHMI system, including the system response time and the proper use of cloud/local services, is also discussed.
Keywords :
"Robots","Microphones","Arrays","Speech","Vehicles","Speech recognition","Roads"
Publisher :
IEEE
Conference_Titel :
Intelligent Robots and Systems (IROS), 2015 IEEE/RSJ International Conference on
Type :
conf
DOI :
10.1109/IROS.2015.7354250
Filename :
7354250