DocumentCode
2787610
Title
Robot-directed speech detection using Multimodal Semantic Confidence based on speech, image, and motion
Author
Zuo, Xiang ; Iwahashi, Naoto ; Taguchi, Ryo ; Matsuda, Shigeki ; Sugiura, Komei ; Funakoshi, Kotaro ; Nakano, Mikio ; Oka, Natsuki
Author_Institution
Adv. Telecommun. Res. Labs., Kyoto, Japan
fYear
2010
fDate
14-19 March 2010
Firstpage
2458
Lastpage
2461
Abstract
In this paper, we propose a novel method to detect robot-directed (RD) speech that adopts the Multimodal Semantic Confidence (MSC) measure. The MSC measure is used to decide whether the speech can be interpreted as a feasible action under the current physical situation in an object manipulation task. This measure is calculated by integrating speech, image, and motion confidence measures with weightings that are optimized by logistic regression. Experimental results show that, compared with a baseline method that uses speech confidence only, MSC achieved an absolute increase of 5% for clean speech and 12% for noisy speech in terms of average maximum F-measure.
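The integration step described in the abstract can be illustrated with a minimal sketch. It assumes that MSC is a logistic function of a weighted sum of the three confidence values, and that an utterance is accepted as RD speech when MSC exceeds a threshold. The function names, the toy training data, the learning rate, and the 0.5 threshold are all hypothetical and are not taken from the paper; the plain gradient-ascent fit stands in for whatever optimizer the authors used.

```python
import math

def sigmoid(z):
    """Standard logistic function."""
    return 1.0 / (1.0 + math.exp(-z))

def msc(speech_conf, image_conf, motion_conf, weights):
    """Combine three confidence values into one score in (0, 1).

    weights = (bias, w_speech, w_image, w_motion); this layout is an
    assumption made for illustration.
    """
    bias, w_s, w_i, w_m = weights
    return sigmoid(bias + w_s * speech_conf + w_i * image_conf + w_m * motion_conf)

def fit_weights(samples, labels, lr=0.5, epochs=3000):
    """Fit logistic-regression weights by batch gradient ascent
    on the log-likelihood (no regularization)."""
    w = [0.0, 0.0, 0.0, 0.0]
    n = len(samples)
    for _ in range(epochs):
        grad = [0.0, 0.0, 0.0, 0.0]
        for (c_s, c_i, c_m), y in zip(samples, labels):
            err = y - msc(c_s, c_i, c_m, w)
            for k, x in enumerate((1.0, c_s, c_i, c_m)):
                grad[k] += err * x
        w = [w_k + lr * g / n for w_k, g in zip(w, grad)]
    return w

def is_robot_directed(speech_conf, image_conf, motion_conf, weights, threshold=0.5):
    """Threshold the combined score; the 0.5 default is hypothetical."""
    return msc(speech_conf, image_conf, motion_conf, weights) > threshold

# Hypothetical toy data: (speech, image, motion) confidences scaled to [0, 1].
# Label 1 = robot-directed, 0 = not robot-directed. Not from the paper.
TOY_SAMPLES = [
    (0.9, 0.8, 0.7), (0.8, 0.9, 0.9), (0.7, 0.6, 0.8),
    (0.2, 0.1, 0.3), (0.3, 0.2, 0.1), (0.6, 0.1, 0.2),
]
TOY_LABELS = [1, 1, 1, 0, 0, 0]

weights = fit_weights(TOY_SAMPLES, TOY_LABELS)
for sample, label in zip(TOY_SAMPLES, TOY_LABELS):
    print(sample, label, is_robot_directed(*sample, weights))
```

The last toy sample has a fairly high speech confidence but low image and motion confidences, which is the kind of case where a speech-only baseline and a combined measure could disagree.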
Keywords
human-robot interaction; motion estimation; object recognition; regression analysis; robots; speech processing; image measure; logistic regression; motion measure; multimodal semantic confidence; object manipulation; robot-directed speech detection; Acoustic signal detection; Current measurement; Face detection; Gas detectors; Humans; Motion detection; Motion measurement; Speech analysis; Speech recognition
fLanguage
English
Publisher
IEEE
Conference_Titel
2010 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
Conference_Location
Dallas, TX
ISSN
1520-6149
Print_ISBN
978-1-4244-4295-9
Electronic_ISBN
1520-6149
Type
conf
DOI
10.1109/ICASSP.2010.5494889
Filename
5494889