DocumentCode :
302311
Title :
Statistical language modeling for speech disfluencies
Author :
Stolcke, Andreas ; Shriberg, Elizabeth
Author_Institution :
Speech Technol. & Res. Lab., SRI Int., Menlo Park, CA, USA
Volume :
1
fYear :
1996
fDate :
7-10 May 1996
Firstpage :
405
Abstract :
Speech disfluencies (such as filled pauses, repetitions, restarts) are among the characteristics distinguishing spontaneous speech from planned or read speech. We introduce a language model that predicts disfluencies probabilistically and uses an edited, fluent context to predict following words. The model is based on a generalization of the standard N-gram language model. It uses dynamic programming to compute the probability of a word sequence, taking into account possible hidden disfluency events. We analyze the model's performance for various disfluency types on the Switchboard corpus. We find that the model reduces the word perplexity in the neighborhood of disfluency events; however, overall differences are small and have no significant impact on the recognition accuracy. We also note that for modeling of the most frequent type of disfluency, filled pauses, a segmentation of utterances into linguistic (rather than acoustic) units is required. Our analysis illustrates a generally useful technique for language model evaluation based on local perplexity comparisons.
Keywords :
dynamic programming; grammars; natural languages; probability; speech processing; speech recognition; statistical analysis; Switchboard corpus; acoustic units; edited fluent context; filled pauses; fluent context; language model; language model evaluation; linguistic units; performance; read speech; recognition accuracy; repetitions; restarts; speech disfluencies; spontaneous speech; standard N-gram language model; statistical language modeling; utterances segmentation; word perplexity; word sequence; Context modeling; Decoding; Error analysis; Laboratories; Performance analysis; Predictive models; Standards development
fLanguage :
English
Publisher :
IEEE
Conference_Title :
1996 IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP-96), Conference Proceedings
Conference_Location :
Atlanta, GA
ISSN :
1520-6149
Print_ISBN :
0-7803-3192-3
Type :
conf
DOI :
10.1109/ICASSP.1996.541118
Filename :
541118