Title :
A hybrid barge-in procedure for more reliable turn-taking in human-machine dialog systems
Author :
Rose, Richard C. ; Kim, Hong Kook
Author_Institution :
AT&T Labs.-Res., USA
Date :
30 Nov.-3 Dec. 2003
Abstract :
This paper investigates techniques that allow users of human-machine dialog systems to interrupt, or barge in over, machine-generated speech messages. An experimental study was performed on utterances collected from a telephone-based dialog system to analyze the effect of barge-in performance on users' speech. One result of this study was that excessive barge-in latencies resulted in disfluencies appearing in over half of users' utterances. A hybrid procedure for barge-in detection is proposed and evaluated on utterances collected from the same domain. The procedure combines a feature-based voice activity detection (VAD) algorithm with a model-based approach for verifying hypothesized speech segments. The procedure is shown to obtain better detection performance than procedures that rely on the speech recognition decoder to detect speech. It is also found to have latencies comparable to those obtained by low-delay feature-based speech detection algorithms.
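The two-stage idea in the abstract (a feature-based detector proposes speech segments, and a model-based step verifies them) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say which features or models are used, so the frame log-energy feature, the single-Gaussian speech and noise models, and every function name, threshold, and parameter below are assumptions chosen only for illustration.

```python
import math

# Illustrative sketch only: the feature (frame log energy), the Gaussian
# models, and all names and thresholds are assumptions, not from the paper.

def frame_log_energies(samples, frame_len=80):
    """Split samples into non-overlapping frames; return log energy per frame."""
    energies = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        energy = sum(x * x for x in frame) / frame_len
        energies.append(math.log(energy + 1e-10))  # small floor avoids log(0)
    return energies

def hypothesize_segments(energies, threshold, min_frames=3):
    """Stage 1 (feature-based VAD): runs of >= min_frames frames above threshold."""
    segments, start = [], None
    for i, e in enumerate(energies):
        if e > threshold:
            if start is None:
                start = i
        elif start is not None:
            if i - start >= min_frames:
                segments.append((start, i))
            start = None
    if start is not None and len(energies) - start >= min_frames:
        segments.append((start, len(energies)))
    return segments

def gaussian_log_pdf(x, mean, var):
    """Log density of a one-dimensional Gaussian."""
    return -0.5 * (math.log(2 * math.pi * var) + (x - mean) ** 2 / var)

def verify_segment(energies, segment, speech_model, noise_model, llr_threshold=0.0):
    """Stage 2 (model-based): accept if the mean log-likelihood ratio favors speech."""
    start, end = segment
    llr = sum(
        gaussian_log_pdf(e, *speech_model) - gaussian_log_pdf(e, *noise_model)
        for e in energies[start:end]
    )
    return llr / (end - start) > llr_threshold

def detect_barge_in(samples, vad_threshold, speech_model, noise_model):
    """Return the first verified (start_frame, end_frame) segment, or None."""
    energies = frame_log_energies(samples)
    for segment in hypothesize_segments(energies, vad_threshold):
        if verify_segment(energies, segment, speech_model, noise_model):
            return segment
    return None
```

In this sketch the first stage is cheap and reacts within a few frames, while the second stage rejects segments whose energy exceeds the VAD threshold but looks more like the assumed background model than the assumed speech model. Each model is passed as a hypothetical `(mean, variance)` pair over log energy.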
Keywords :
feature extraction; speech processing; speech-based user interfaces; barge-in latency; feature-based voice activity detection; human-machine dialog systems; hybrid barge-in procedure; hypothesized speech segment verification; low delay speech detection; machine generated speech message interruption; speech analysis; turn-taking; user utterance disfluencies; Acoustic signal detection; Automatic speech recognition; Computer vision; Decoding; Delay; Event detection; Man machine systems; Protocols; Speech analysis; Telephony;
Conference_Title :
Automatic Speech Recognition and Understanding, 2003. ASRU '03. 2003 IEEE Workshop on
Print_ISBN :
0-7803-7980-2
DOI :
10.1109/ASRU.2003.1318428