Title :
A statistical language modeling approach integrating local and global constraints
Author :
Bellegarda, Jerome R.
Author_Institution :
Spoken Language Group, Apple Computer, Inc., Cupertino, CA, USA
Abstract :
A new framework is proposed to integrate the various constraints, both local and global, that are present in language. Local constraints are captured via n-gram language modeling, while global constraints are taken into account through the use of latent semantic analysis. An integrative formulation is derived for the combination of these two paradigms, resulting in several families of multi-span language models for large-vocabulary speech recognition. Because of the inherent complementarity in the two types of constraints, the performance of the integrated language models, as measured by perplexity, compares favorably with the corresponding n-gram performance.
Keywords :
constraint theory; modelling; natural languages; nomograms; performance index; speech recognition; statistics; vocabulary; complementarity; global constraints; integrated language models; large-vocabulary speech recognition; latent semantic analysis; local constraints; multi-span language models; n-gram language modeling; performance; perplexity; statistical language modeling; Data mining; Databases; Displays; Frequency; Natural languages; Power measurement; Power system modeling; Predictive models; Speech recognition; Vocabulary
Conference_Title :
1997 IEEE Workshop on Automatic Speech Recognition and Understanding Proceedings
Conference_Location :
Santa Barbara, CA
Print_ISBN :
0-7803-3698-4
DOI :
10.1109/ASRU.1997.659014