Regional Information Center for Science and Technology - Making a robot recognize three simultaneous sentences in real-time

DocumentCode :

2594600

Title :

Making a robot recognize three simultaneous sentences in real-time

Author :

Yamamoto, Shun'ichi ; Nakadai, Kazuhiro ; Valin, Jean-Marc ; Rouat, Jean ; Michaud, François ; Komatani, Kazunori ; Ogata, Tetsuya ; Okuno, Hiroshi G.

Author_Institution :

Graduate Sch. of Informatics, Kyoto Univ., Japan

fYear :

2005

fDate :

2-6 Aug. 2005

Firstpage :

4040

Lastpage :

4045

Abstract :

A humanoid robot in a real-world environment usually hears mixtures of sounds, so three capabilities are essential for robot audition: sound source localization, separation, and recognition of the separated sounds. We have adopted missing feature theory (MFT) for automatic recognition of separated speech and developed a robot audition system based on it. A microphone array is used along with a dedicated real-time implementation of geometric source separation (GSS) and a multi-channel post-filter that further reduces interference from other sources. The MFT-based automatic speech recognizer (ASR) recognizes separated sounds using missing feature masks generated automatically from the post-filtering step. The main advantage of this approach for humanoid robots is that an ASR with a clean acoustic model can adapt to the distortion of the separated sound by consulting the post-filter feature masks. In this paper, we use an improved version of Julius, a real-time large vocabulary continuous speech recognition (LVCSR) system, as the MFT-based ASR. We performed an experiment to evaluate our robot audition system, in which the system recognizes sentences rather than isolated words. The results show improved system performance in recognizing three simultaneous utterances on the humanoid SIG2.
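
The idea of scoring only the features that the post-filter marks as reliable can be illustrated with a minimal sketch. It is not taken from the paper: the energy-ratio rule, the 0.5 threshold, the diagonal-Gaussian model, and the function names missing_feature_mask and masked_log_likelihood are all assumptions made for illustration, and the actual system's mask generation and its integration with Julius may differ.

```python
import math


def missing_feature_mask(input_energy, output_energy, threshold=0.5):
    """Return a hard 0/1 mask per feature (hypothetical rule).

    A feature is treated as reliable (1) when the post-filter kept at
    least `threshold` of the energy it received, and unreliable (0)
    otherwise. The threshold value is an assumption, not from the paper.
    """
    mask = []
    for e_in, e_out in zip(input_energy, output_energy):
        ratio = e_out / e_in if e_in > 0 else 0.0
        mask.append(1 if ratio >= threshold else 0)
    return mask


def masked_log_likelihood(features, mask, means, variances):
    """Diagonal-Gaussian log-likelihood summed over reliable features only.

    Unreliable features are skipped (marginalized out), so a clean
    acoustic model is not penalized for distortion in those dimensions.
    """
    total = 0.0
    for x, m, mu, var in zip(features, mask, means, variances):
        if m:
            total += -0.5 * (math.log(2 * math.pi * var) + (x - mu) ** 2 / var)
    return total


# Toy data: features 1 and 3 lost most of their energy in post-filtering.
input_energy = [1.0, 2.0, 4.0, 1.0]
output_energy = [0.9, 0.4, 3.6, 0.1]
mask = missing_feature_mask(input_energy, output_energy)

features = [0.0, 5.0, 0.0, 5.0]   # distorted values sit in masked dimensions
means = [0.0, 0.0, 0.0, 0.0]      # clean-model means (made up)
variances = [1.0, 1.0, 1.0, 1.0]  # clean-model variances (made up)

masked_score = masked_log_likelihood(features, mask, means, variances)
full_score = masked_log_likelihood(features, [1, 1, 1, 1], means, variances)
print(mask, masked_score, full_score)
```

In this toy case the masked score is higher than the unmasked one because the two distorted dimensions no longer count against the clean model, which is the effect the abstract attributes to consulting the post-filter feature masks.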

Keywords :

humanoid robots; microphone arrays; real-time systems; source separation; speech recognition; automatic missing feature mask generation; automatic speech recognition; geometric source separation; humanoid robot; large vocabulary continuous speech recognition; microphone array; missing feature theory; multichannel post-filter; real-time system; real-world environment; robot audition system; sound distortion; sound recognition; sound separation; sound source localization; Acoustic distortion; Interference; Robotics and automation; Vocabulary; continuous speech recognition; robot audition

fLanguage :

English

Publisher :

IEEE

Conference_Title :

2005 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2005)

Print_ISBN :

0-7803-8912-3

Type :

conf

DOI :

10.1109/IROS.2005.1545094

Filename :

1545094

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=2594600