DocumentCode :
2285256
Title :
Audio-visual intent-to-speak detection for human-computer interaction
Author :
De Cuetos, Philippe ; Neti, Chalapathy ; Senior, Andrew W.
Author_Institution :
Inst. Eurecom, Sophia-Antipolis, France
Volume :
6
fYear :
2000
fDate :
2000
Firstpage :
2373
Abstract :
Introduces a practical system that aims to detect a user's intent to speak to a computer, by considering both audio and visual cues. The whole system is designed to intuitively turn on the microphone for speech recognition without needing to click on a mouse, thus improving the human-like communication between users and computers. The first step is to detect a frontal face through a simple desktop video camera image, by using some well-known image processing techniques for face and facial feature detection on one image. The second step is an audio-visual speech event detection that combines both visual and audio indications of speech. In this paper, we consider visual measures of speech activity as well as audio energy to determine if the previously detected user is actually speaking or not.
Keywords :
audio-visual systems; face recognition; feature extraction; microphones; speech recognition; speech-based user interfaces; audio cues; audio energy; audiovisual speech event detection; desktop video camera image; facial feature detection; frontal face detection; human-computer interaction; human-like communication; image processing techniques; intent-to-speak detection; microphone; speech activity measures; speech recognition; visual cues; visual measures; Face detection; Humans; Keyboards; Mice; Mouth; Shape; Speech recognition; Text processing; USA Councils
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Acoustics, Speech, and Signal Processing, 2000. ICASSP '00. Proceedings. 2000 IEEE International Conference on
Conference_Location :
Istanbul
ISSN :
1520-6149
Print_ISBN :
0-7803-6293-4
Type :
conf
DOI :
10.1109/ICASSP.2000.859318
Filename :
859318