DocumentCode :
2787610
Title :
Robot-directed speech detection using Multimodal Semantic Confidence based on speech, image, and motion
Author :
Zuo, Xiang ; Iwahashi, Naoto ; Taguchi, Ryo ; Matsuda, Shigeki ; Sugiura, Komei ; Funakoshi, Kotaro ; Nakano, Mikio ; Oka, Natsuki
Author_Institution :
Adv. Telecommun. Res. Labs., Kyoto, Japan
fYear :
2010
fDate :
14-19 March 2010
Firstpage :
2458
Lastpage :
2461
Abstract :
In this paper, we propose a novel method to detect robot-directed (RD) speech that adopts the Multimodal Semantic Confidence (MSC) measure. The MSC measure is used to decide whether the speech can be interpreted as a feasible action under the current physical situation in an object manipulation task. This measure is calculated by integrating speech, image, and motion confidence measures with weightings that are optimized by logistic regression. Experimental results show that, compared with a baseline method that uses speech confidence only, MSC achieved an absolute increase of 5% for clean speech and 12% for noisy speech in terms of average maximum F-measure.
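The integration step described in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact formula, so the logistic form below, the function names (`msc`, `fit_weights`, `is_robot_directed`), the 0.5 decision threshold, and the toy confidence values are all assumptions made for illustration; they are not the authors' implementation or data.

```python
import math

def msc(speech_conf, image_conf, motion_conf, weights):
    """Combine three confidence scores into one value in (0, 1).

    Assumed form: a logistic function of a weighted sum plus a bias term.
    """
    w0, w1, w2, w3 = weights
    z = w0 + w1 * speech_conf + w2 * image_conf + w3 * motion_conf
    return 1.0 / (1.0 + math.exp(-z))

def fit_weights(samples, labels, lr=0.5, epochs=2000):
    """Fit the bias and three weights by batch gradient descent on log-loss."""
    w = [0.0, 0.0, 0.0, 0.0]
    n = len(samples)
    for _ in range(epochs):
        grad = [0.0, 0.0, 0.0, 0.0]
        for (s, i, m), y in zip(samples, labels):
            err = msc(s, i, m, w) - y  # derivative of log-loss w.r.t. z
            for k, x in enumerate((1.0, s, i, m)):
                grad[k] += err * x
        w = [wk - lr * g / n for wk, g in zip(w, grad)]
    return w

def is_robot_directed(speech_conf, image_conf, motion_conf, weights, threshold=0.5):
    """Classify an utterance as robot-directed if its score exceeds the threshold."""
    return msc(speech_conf, image_conf, motion_conf, weights) > threshold

# Hypothetical toy data: (speech, image, motion) confidences, label 1 = robot-directed.
samples = [
    (0.9, 0.8, 0.7), (0.8, 0.9, 0.9), (0.6, 0.7, 0.8),
    (0.7, 0.2, 0.1), (0.3, 0.1, 0.2), (0.8, 0.3, 0.2),
]
labels = [1, 1, 1, 0, 0, 0]

weights = fit_weights(samples, labels)
predictions = [int(is_robot_directed(s, i, m, weights)) for s, i, m in samples]
```

In this toy set, two of the negative examples have high speech confidence but low image and motion confidence, which is the kind of case where a speech-only baseline would be expected to struggle.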
Keywords :
human-robot interaction; motion estimation; object recognition; regression analysis; robots; speech processing; image measure; logistic regression; motion measure; multimodal semantic confidence; object manipulation; robot directed speech detection; Acoustic signal detection; Current measurement; Face detection; Gas detectors; Humans; Motion detection; Motion measurement; Robots; Speech analysis; Speech recognition; robot-directed speech detection
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Acoustics Speech and Signal Processing (ICASSP), 2010 IEEE International Conference on
Conference_Location :
Dallas, TX
ISSN :
1520-6149
Print_ISBN :
978-1-4244-4295-9
Electronic_ISBN :
1520-6149
Type :
conf
DOI :
10.1109/ICASSP.2010.5494889
Filename :
5494889