Regional Information Center for Science and Technology - Audio-visual integration for human-robot interaction in multi-person scenarios

DocumentCode :

1792730

Title :

Audio-visual integration for human-robot interaction in multi-person scenarios

Author :

Quang Nguyen ; Sang-Seok Yun ; JongSuk Choi

Author_Institution :

Center for Bionics, Korea Inst. of Sci. & Technol. (KIST), Seoul, South Korea

fYear :

2014

fDate :

16-19 Sept. 2014

Firstpage :

Lastpage :

Abstract :

This paper presents the integration of audio-visual perception components for human-robot interaction in the Robot Operating System (ROS). The visual nodes perform skeleton tracking and gesture recognition using a depth camera, and face recognition using an RGB camera. Auditory perception is based on sound source localization using a microphone array. We present a framework that integrates these nodes through a top-down hierarchical messaging protocol. At the top of the hierarchy, a message carries the number of persons and their corresponding states (who, what, where), which are updated by many low-level perception nodes. This top-level message is passed to a planning node, which decides the robot's reaction according to what it perceives about the surrounding people. The paper demonstrates human-robot interaction in a multi-person scenario in which the robot directs its attention to persons who are speaking or waving a hand. Moreover, the modular architecture allows modules to be reused in other applications. To validate this approach, two sound source localization algorithms are evaluated in real time, with ground-truth localization provided by the face recognition module.
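
The top-level message and the planning step described in the abstract can be illustrated with a minimal sketch. It uses plain Python dataclasses rather than ROS message types so that it runs on its own. All names here (`PersonState`, `SceneMessage`, `choose_attention_target`) and the field layout are hypothetical, and the rule that a speaking person takes priority over a waving one is an assumption made for illustration; the record does not state how the authors' planner resolves that case.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class PersonState:
    """State of one tracked person: who, what, where (hypothetical fields)."""
    name: str                 # "who": identity, e.g. from face recognition
    gesture: str = "none"     # "what": e.g. "waving", from gesture recognition
    speaking: bool = False    # "what": set when a localized sound matches this person
    azimuth_deg: float = 0.0  # "where": bearing relative to the robot


@dataclass
class SceneMessage:
    """Top-level message: the tracked persons and their states."""
    persons: List[PersonState] = field(default_factory=list)

    @property
    def count(self) -> int:
        """Number of persons currently in the scene."""
        return len(self.persons)


def choose_attention_target(scene: SceneMessage) -> Optional[PersonState]:
    """Pick whom the robot attends to.

    Assumed priority (not from the source): speakers first, then wavers.
    Returns None when nobody is speaking or waving.
    """
    for person in scene.persons:
        if person.speaking:
            return person
    for person in scene.persons:
        if person.gesture == "waving":
            return person
    return None


if __name__ == "__main__":
    scene = SceneMessage([
        PersonState("person_a", gesture="waving", azimuth_deg=-30.0),
        PersonState("person_b", speaking=True, azimuth_deg=20.0),
    ])
    target = choose_attention_target(scene)
    print(scene.count, target.name if target else None)
```

In a ROS setting the two dataclasses would correspond to custom message definitions published by an integration node and subscribed to by the planning node; that mapping is likewise an assumption about how such a design could be laid out.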

Keywords :

acoustic signal detection; face recognition; gesture recognition; hearing; human-robot interaction; object tracking; protocols; RGB camera; Robot Operating System; audio-visual perception component integration; auditory perception; depth camera; ground-truth localization; microphone array; modularization architecture; multiperson scenarios; skeleton tracking; sound source localization algorithm; top-down hierarchical messaging protocol; Face; Microphones; Robot sensing systems; ROS; audio-visual perception; sound source localization

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Emerging Technology and Factory Automation (ETFA), 2014 IEEE

Conference_Location :

Barcelona

Type :

conf

DOI :

10.1109/ETFA.2014.7005303

Filename :

7005303

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=1792730