DocumentCode :
312331
Title :
Word predictability after hesitations: a corpus-based study
Author :
Shriberg, Elizabeth ; Stolcke, Andreas
Author_Institution :
Speech Technol. & Res. Lab., SRI Int., Menlo Park, CA, USA
Volume :
3
fYear :
1996
fDate :
3-6 Oct 1996
Firstpage :
1868
Abstract :
This paper asks whether lexical hesitations in spontaneous speech tend to precede words that are difficult to predict. We define predictability in terms of both transition probability and entropy, in the context of an N-gram language model. The results show that the transition probability is significantly lower at hesitation transitions, and that this is attributable to both the following word and the word history. In addition, the results suggest that fluent transitions in sentences with a hesitation elsewhere are significantly more likely to contain out-of-vocabulary words and novel word combinations. Such findings could be used to improve statistical language modeling for spontaneous speech applications.
Keywords :
entropy; linguistics; nomograms; probability; psychology; speech; N-gram language model; corpus-based study; fluent transitions; following word; hesitation transitions; lexical hesitations; novel word combinations; out-of-vocabulary words; sentences; spontaneous speech; statistical language modeling; transition probability; word history; word predictability; Context modeling; Entropy; History; Humans; Laboratories; Natural languages; Predictive models; Probability; Speech; Testing
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Spoken Language, 1996. ICSLP 96. Proceedings., Fourth International Conference on
Conference_Location :
Philadelphia, PA
Print_ISBN :
0-7803-3555-4
Type :
conf
DOI :
10.1109/ICSLP.1996.607996
Filename :
607996