DocumentCode :
1611792
Title :
Learning the Correspondence between Continuous Speeches and Motions
Author :
Natsuki, O. ; Arata, N. ; Yoshiaki, I.
Author_Institution :
Kyoto Inst. of Technol.
fYear :
2005
Firstpage :
202
Lastpage :
202
Abstract :
Summary form only given. Roy (1999) developed a computational model of early lexical learning to address three questions: first, how do infants discover linguistic units? Second, how do they learn perceptually grounded semantic categories? And third, how do they learn to associate linguistic units with the appropriate semantic categories? His model coupled speech recordings with static images of objects and acquired a lexicon of shape names. Kaplan et al. (2001) presented a model for teaching names of actions to an enhanced version of AIBO, which had built-in speech recognition facilities and behaviors. In this paper, we try to build a system that learns the correspondence between continuous speech and continuous motion without a built-in speech recognizer or built-in behaviors. We teach a RobotPHONE to respond properly to voices by taking its hands and moving them; for example, one says 'bye-bye' to the RobotPHONE while holding its hand and waving it. From continuous input, the system must segment speech and discover acoustic units that correspond to words. Segmentation is based on recurrent patterns found by incremental reference interval-free continuous DP (IRIFCDP) (Kiyama et al., 1996; Utsunomiya et al., 2004), and we accelerate IRIFCDP using ShiftCDP (Itoh and Tanaka, 2004). The system also segments motion with the accelerated IRIFCDP and memorizes co-occurring speech and motion patterns. It can then respond properly to taught words by detecting them in speech input with ShiftCDP. We gave a demonstration with a RobotPHONE at the conference. We expect that the system can learn words in any language because it has no built-in facilities specific to any particular language.
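The abstract describes a spotting step in which taught words are detected in continuous speech input by continuous DP matching (ShiftCDP). As an illustrative aid only, the sketch below shows plain subsequence DTW template spotting in Python, not the authors' IRIFCDP/ShiftCDP; the function name spot_template, the feature shapes, and the toy data are assumptions introduced here.

```python
import numpy as np


def spot_template(template, stream):
    """Subsequence DTW: locate the best match of `template` inside `stream`.

    template: (T, D) array, feature sequence of a taught word (e.g. MFCC frames)
    stream:   (S, D) array, continuous input feature sequence
    Returns (start, end, cost): inclusive frame span in `stream` and its alignment cost.
    """
    T, S = len(template), len(stream)
    # frame-to-frame Euclidean distances, shape (T, S)
    dist = np.linalg.norm(template[:, None, :] - stream[None, :, :], axis=2)

    acc = np.full((T, S), np.inf)        # accumulated alignment cost
    start = np.zeros((T, S), dtype=int)  # stream frame where each path began
    acc[0] = dist[0]                     # a match may start at any stream frame
    start[0] = np.arange(S)

    for i in range(1, T):
        acc[i, 0] = acc[i - 1, 0] + dist[i, 0]  # only a vertical step reaches (i, 0)
        for j in range(1, S):
            prev = (acc[i - 1, j], acc[i, j - 1], acc[i - 1, j - 1])
            k = int(np.argmin(prev))
            acc[i, j] = dist[i, j] + prev[k]
            start[i, j] = start[((i - 1, j), (i, j - 1), (i - 1, j - 1))[k]]

    end = int(np.argmin(acc[-1]))        # a match may end at any stream frame
    return int(start[-1, end]), end, float(acc[-1, end])


if __name__ == "__main__":
    # toy usage: embed a noisy copy of a "word" in a longer random stream
    rng = np.random.default_rng(0)
    word = rng.normal(size=(20, 12))
    stream = np.vstack([rng.normal(size=(30, 12)),
                        word + 0.05 * rng.normal(size=(20, 12)),
                        rng.normal(size=(25, 12))])
    print(spot_template(word, stream))   # expected span roughly frames 30..49
```

Unlike this sketch, which needs a known template, IRIFCDP discovers recurrent patterns in the input itself without a reference interval, which is what lets the system segment both speech and motion before any words are known.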
Keywords :
acoustics; image motion analysis; learning (artificial intelligence); linguistics; robots; speech processing; AIBO; RobotPHONE; ShiftCDP; acoustic units; continuous motion; continuous speech; incremental reference interval-free continuous DP; language acquisition; lexical learning; linguistics; motion segmentation; semantic category; speech recognition; speech recording; speech segmentation; static object image; Acceleration; Computational modeling; Education; Educational robots; Natural languages; Neck; Pediatrics; Shape; Shoulder; Speech recognition;
fLanguage :
English
Publisher :
ieee
Conference_Titel :
Proceedings of the 4th International Conference on Development and Learning, 2005
Conference_Location :
Osaka
Print_ISBN :
0-7803-9226-4
Type :
conf
DOI :
10.1109/DEVLRN.2005.1490983
Filename :
1490983