Title :
Word predictability after hesitations: a corpus-based study
Author :
Shriberg, Elizabeth ; Stolcke, Andreas
Author_Institution :
Speech Technol. & Res. Lab., SRI Int., Menlo Park, CA, USA
Abstract :
We ask whether lexical hesitations in spontaneous speech tend to precede words that are difficult to predict. We define predictability in terms of both transition probability and entropy, in the context of an N-gram language model. The results show that the transition probability is significantly lower at hesitation transitions, and that this is attributable to both the following word and the word history. In addition, the results suggest that fluent transitions in sentences with a hesitation elsewhere are significantly more likely to contain out-of-vocabulary words and novel word combinations. Such findings could be used to improve statistical language modeling for spontaneous speech applications.
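To make the two predictability measures concrete, the following is a minimal sketch, not the authors' implementation. It assumes a bigram model with unsmoothed maximum-likelihood estimates and a made-up toy corpus; the function names (train_bigrams, transition_probability, next_word_entropy) and the "<s>" start marker are illustrative choices, and the abstract does not state the N-gram order or smoothing actually used.

```python
import math
from collections import Counter, defaultdict


def train_bigrams(sentences):
    """Count bigrams over tokenized sentences, using "<s>" as a start marker."""
    counts = defaultdict(Counter)
    for tokens in sentences:
        prev = "<s>"
        for word in tokens:
            counts[prev][word] += 1
            prev = word
    return counts


def transition_probability(counts, history, word):
    """Unsmoothed estimate of P(word | history); 0.0 for an unseen history."""
    following = counts.get(history, Counter())
    total = sum(following.values())
    return following[word] / total if total else 0.0


def next_word_entropy(counts, history):
    """Entropy (in bits) of the next-word distribution after `history`."""
    following = counts.get(history, Counter())
    total = sum(following.values())
    if not total:
        return 0.0
    return -sum((c / total) * math.log2(c / total) for c in following.values())


# Hypothetical toy corpus, for illustration only.
sentences = [
    ["i", "want", "a", "flight"],
    ["i", "want", "to", "go"],
    ["i", "need", "a", "flight"],
    ["i", "want", "a", "ticket"],
]
counts = train_bigrams(sentences)

print(transition_probability(counts, "i", "want"))  # how expected "want" is after "i"
print(next_word_entropy(counts, "i"))               # how uncertain the word after "i" is
```

In this framing, a low transition probability means the word that actually followed was unexpected given its history, while high entropy means the history leaves many continuations open regardless of which word followed. A realistic study would need smoothing and a much larger training corpus.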
Keywords :
entropy; linguistics; nomograms; probability; psychology; speech; N-gram language model; corpus-based study; fluent transitions; following word; hesitation transitions; lexical hesitations; novel word combinations; out-of-vocabulary words; sentences; spontaneous speech; statistical language modeling; transition probability; word history; word predictability; Context modeling; Entropy; History; Humans; Laboratories; Natural languages; Predictive models; Probability; Speech; Testing
Conference_Title :
Fourth International Conference on Spoken Language (ICSLP 96), Proceedings, 1996
Conference_Location :
Philadelphia, PA
Print_ISBN :
0-7803-3555-4
DOI :
10.1109/ICSLP.1996.607996