DocumentCode :
1611792
Title :
Learning the Correspondence between Continuous Speeches and Motions
Author :
Natsuki, O. ; Arata, N. ; Yoshiaki, I.
Author_Institution :
Kyoto Inst. of Technol.
fYear :
2005
Firstpage :
202
Lastpage :
202
Abstract :
Summary form only given. Roy (1999) developed a computational model of early lexical learning to address three questions: first, how do infants discover linguistic units? Second, how do they learn perceptually grounded semantic categories? And third, how do they learn to associate linguistic units with the appropriate semantic categories? His model coupled speech recordings with static images of objects and acquired a lexicon of shape names. Kaplan et al. (2001) presented a model for teaching names of actions to an enhanced version of AIBO, which had built-in speech recognition facilities and behaviors. In this paper, we try to build a system that learns the correspondence between continuous speech and continuous motion without a built-in speech recognizer or built-in behaviors. We teach a RobotPHONE to respond properly to voices by taking its hands and moving them; for example, one says 'bye-bye' to the RobotPHONE while holding its hand and waving it. From continuous input, the system must segment speech and discover acoustic units that correspond to words. Segmentation is based on recurrent patterns found by incremental reference interval-free continuous DP (IRIFCDP) (Kiyama et al., 1996; Utsunomiya et al., 2004), and we accelerate IRIFCDP using ShiftCDP (Itoh and Tanaka, 2004). The system also segments motion with the accelerated IRIFCDP and memorizes co-occurring speech and motion patterns. It can then respond properly to taught words by detecting them in speech input with ShiftCDP. We gave a demonstration with a RobotPHONE at the conference. We expect that the system can learn words in any language because it has no built-in facilities specific to any particular language.
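The abstract describes a spotting step in which taught words are detected in continuous speech input by continuous DP matching (ShiftCDP). As an illustrative aid only, the sketch below shows plain subsequence DTW template spotting in Python, not the authors' IRIFCDP/ShiftCDP; the function name spot_template, the feature shapes, and the toy data are assumptions introduced here.

```python
import numpy as np


def spot_template(template, stream):
    """Subsequence DTW: locate the best match of `template` inside `stream`.

    template: (T, D) array, feature sequence of a taught word (e.g. MFCC frames)
    stream:   (S, D) array, continuous input feature sequence
    Returns (start, end, cost): inclusive frame span in `stream` and its alignment cost.
    """
    T, S = len(template), len(stream)
    # frame-to-frame Euclidean distances, shape (T, S)
    dist = np.linalg.norm(template[:, None, :] - stream[None, :, :], axis=2)

    acc = np.full((T, S), np.inf)        # accumulated alignment cost
    start = np.zeros((T, S), dtype=int)  # stream frame where each path began
    acc[0] = dist[0]                     # a match may start at any stream frame
    start[0] = np.arange(S)

    for i in range(1, T):
        acc[i, 0] = acc[i - 1, 0] + dist[i, 0]  # only a vertical step reaches (i, 0)
        for j in range(1, S):
            prev = (acc[i - 1, j], acc[i, j - 1], acc[i - 1, j - 1])
            k = int(np.argmin(prev))
            acc[i, j] = dist[i, j] + prev[k]
            start[i, j] = start[((i - 1, j), (i, j - 1), (i - 1, j - 1))[k]]

    end = int(np.argmin(acc[-1]))        # a match may end at any stream frame
    return int(start[-1, end]), end, float(acc[-1, end])


if __name__ == "__main__":
    # toy usage: embed a noisy copy of a "word" in a longer random stream
    rng = np.random.default_rng(0)
    word = rng.normal(size=(20, 12))
    stream = np.vstack([rng.normal(size=(30, 12)),
                        word + 0.05 * rng.normal(size=(20, 12)),
                        rng.normal(size=(25, 12))])
    print(spot_template(word, stream))   # expected span roughly frames 30..49
```

Unlike this sketch, which needs a known template, IRIFCDP discovers recurrent patterns in the input itself without a reference interval, which is what lets the system segment both speech and motion before any words are known.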
Keywords :
acoustics; image motion analysis; learning (artificial intelligence); linguistics; robots; speech processing; AIBO; RobotPHONE; ShiftCDP; acoustic units; continuous motion; continuous speech; incremental reference interval-free continuous DP; language acquisition; lexical learning; linguistics; motion segmentation; semantic category; speech recognition; speech recording; speech segmentation; static object image; Acceleration; Computational modeling; Education; Educational robots; Natural languages; Neck; Pediatrics; Shape; Shoulder; Speech recognition;
fLanguage :
English
Publisher :
ieee
Conference_Titel :
Proceedings of the 4th International Conference on Development and Learning, 2005
Conference_Location :
Osaka
Print_ISBN :
0-7803-9226-4
Type :
conf
DOI :
10.1109/DEVLRN.2005.1490983
Filename :
1490983