Title :
The effect of language model probability on pronunciation reduction
Author :
Jurafsky, Daniel ; Bell, Alan ; Gregory, Mark ; Raymond, William D.
Author_Institution :
Linguistics Dept., Colorado Univ., Boulder, CO, USA
Abstract :
We investigate how the probability of a word affects its pronunciation. We examined 5618 tokens of the 10 most frequent (function) words in Switchboard and 2042 tokens of content words whose lexical form ends in a t or d. Our observations were drawn from the phonetically hand-transcribed subset of the Switchboard corpus, enabling us to code each word with its pronunciation and duration. Using linear and logistic regression to control for contextual factors, we show that words with a high unigram, bigram, or reverse bigram (given the following word) probability are shorter, more likely to have a reduced vowel, and more likely to have a deleted final t or d. These results suggest that pronunciation models in speech recognition and synthesis should take into account word probability given both the previous and following words, for both content and function words.
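As an illustration of the three probability measures named in the abstract, the following is a minimal sketch that estimates unigram, bigram (given the previous word), and reverse bigram (given the following word) probabilities by relative frequency. The function name `ngram_probabilities`, the toy token list, and the unsmoothed count-ratio estimates are assumptions made for illustration; the abstract does not state how the authors estimated these probabilities.

```python
from collections import Counter


def ngram_probabilities(tokens):
    """Return three functions estimating unigram, bigram, and reverse
    bigram probabilities by relative frequency (no smoothing).

    Hypothetical helper for illustration only; not the paper's estimator.
    """
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    total = len(tokens)

    def p_unigram(w):
        # P(w) = C(w) / N
        return unigrams[w] / total

    def p_bigram(prev, w):
        # P(w | prev) ~ C(prev, w) / C(prev)
        return bigrams[(prev, w)] / unigrams[prev] if unigrams[prev] else 0.0

    def p_reverse_bigram(w, nxt):
        # P(w | nxt) ~ C(w, nxt) / C(nxt): probability of w given the
        # FOLLOWING word, as in the abstract's "reverse bigram".
        return bigrams[(w, nxt)] / unigrams[nxt] if unigrams[nxt] else 0.0

    return p_unigram, p_bigram, p_reverse_bigram


# Toy data (an assumption, not Switchboard).
tokens = "i want to go to the store and i want to eat".split()
p_uni, p_bi, p_rev = ngram_probabilities(tokens)

print(p_uni("to"))                    # share of all tokens that are "to"
print(p_bi("want", "to"))             # "to" given previous word "want"
print(round(p_rev("want", "to"), 3))  # "want" given following word "to"
```

In a regression of the kind the abstract describes, values like these (typically log-transformed) would enter as predictors alongside the contextual control factors; that step is omitted here because the abstract does not specify the model.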
Keywords :
linguistics; probability; speech recognition; speech synthesis; statistical analysis; Switchboard corpus; contextual factors; high unigram probability; language model probability; lexical form; linear regression; logistic regression; pronunciation reduction; reverse bigram probability; word probability; Collaborative work; Frequency; Logistics; Natural languages; Predictive models; Probability distribution; Regression analysis
Conference_Title :
Acoustics, Speech, and Signal Processing, 2001. Proceedings. (ICASSP '01). 2001 IEEE International Conference on
Conference_Location :
Salt Lake City, UT
Print_ISBN :
0-7803-7041-4
DOI :
10.1109/ICASSP.2001.941036