Author Institution:
Media Lab., MIT, Cambridge, MA, USA
Abstract:
Face-to-face interaction between people is generally effortless and effective. We exchange glances, take turns speaking, and make facial and manual gestures to achieve the goals of the dialogue. This paper describes an action composition and selection architecture for synthetic characters capable of full-duplex, real-time face-to-face interaction with a human. This architecture is part of a computational model of psychosocial dialogue skills, called Ymir, that bridges multimodal perception and multimodal action generation. To test the architecture, a prototype humanoid named Gandalf has been implemented; Gandalf commands a graphical model of the solar system and can engage in task-directed dialogue with people using speech and manual and facial gesture. Gandalf has been tested in interaction with users and has been shown capable of fluid turn-taking and multimodal dialogue. The primary focus of this paper is on the action selection mechanisms and the low-level composition of motor commands. An overview is also given of the Ymir model and Gandalf's graphical representation.
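The abstract describes a layered architecture in which perceptual evidence feeds decision processes that select behaviors, and each behavior is composed from low-level motor commands. The following is a minimal, hypothetical Python sketch of that general idea, not the Ymir implementation itself; all names here (MotorCommand, Behavior, ActionSelector) and the turn-taking rule are illustrative assumptions only.

# Minimal sketch of layered action selection and motor-command composition.
# NOTE: This is an illustration of the idea in the abstract, not the authors'
# actual Ymir/Gandalf code. All names and the example rule are assumptions.

from dataclasses import dataclass, field
from typing import Callable


@dataclass
class MotorCommand:
    """A low-level motor primitive, e.g. 'raise eyebrows' or 'glance at user'."""
    effector: str      # e.g. "face", "gaze", "hand", "speech"
    action: str        # name of the primitive action
    duration_ms: int   # how long the primitive runs


@dataclass
class Behavior:
    """A named behavior composed from motor primitives (the composition layer)."""
    name: str
    commands: list[MotorCommand] = field(default_factory=list)


class ActionSelector:
    """Maps perceptual state to a behavior; a stand-in for a decision layer."""

    def __init__(self) -> None:
        # Each rule pairs a predicate over the percept dict with a behavior.
        self.rules: list[tuple[Callable[[dict], bool], Behavior]] = []

    def add_rule(self, condition: Callable[[dict], bool], behavior: Behavior) -> None:
        self.rules.append((condition, behavior))

    def select(self, percepts: dict) -> Behavior | None:
        # First matching rule wins; a fuller system would arbitrate by priority.
        for condition, behavior in self.rules:
            if condition(percepts):
                return behavior
        return None


if __name__ == "__main__":
    # Turn-taking example: if the user has stopped speaking and is looking
    # at the agent, take the turn with a composed gaze + face + speech behavior.
    take_turn = Behavior("take-turn", [
        MotorCommand("gaze", "look-at-user", 300),
        MotorCommand("face", "raise-eyebrows", 200),
        MotorCommand("speech", "begin-utterance", 0),
    ])

    selector = ActionSelector()
    selector.add_rule(
        lambda p: not p["user_speaking"] and p["user_gaze_on_agent"],
        take_turn,
    )

    behavior = selector.select({"user_speaking": False, "user_gaze_on_agent": True})
    if behavior:
        for cmd in behavior.commands:
            print(f"{cmd.effector}: {cmd.action} ({cmd.duration_ms} ms)")

The separation shown here (rules that select a behavior versus the motor primitives that realize it) mirrors the abstract's distinction between action selection and low-level composition of motor commands.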
Keywords:
computer animation; natural language interfaces; software agents; virtual reality; Gandalf; Ymir model; action composition; action selection architecture; action selection mechanisms; animation; artificial intelligence; autonomous agents; communicative humanoids; fluid turn-taking; full-duplex real-time face-to-face interaction; layered modular action control; low-level composition; multimodal action generation; multimodal dialogue; multimodal perception; psychosocial dialogue skills; real-time dialogue; synthetic characters; task-directed dialogue