DocumentCode :
1420403
Title :
A multispan language modeling framework for large vocabulary speech recognition
Author :
Bellegarda, Jerome R.
Author_Institution :
Spoken Language Group, Apple Comput. Inc., Cupertino, CA, USA
Volume :
6
Issue :
5
fYear :
1998
fDate :
9/1/1998 12:00:00 AM
Firstpage :
456
Lastpage :
467
Abstract :
A new framework is proposed to construct multispan language models for large vocabulary speech recognition, by exploiting both local and global constraints present in the language. While statistical n-gram modeling can readily take local constraints into account, global constraints have been more difficult to handle within a data-driven formalism. In this work, they are captured via a paradigm first formulated in the context of information retrieval, called latent semantic analysis (LSA). This paradigm seeks to automatically uncover the salient semantic relationships between words and documents in a given corpus. Such discovery relies on a parsimonious vector representation of each word and each document in a suitable, common vector space. Since in this space familiar clustering techniques can be applied, it becomes possible to derive several families of large-span language models, with various smoothing properties. Because of their semantic nature, the new language models are well suited to complement conventional, more syntactically oriented n-grams, and the combination of the two paradigms naturally yields the benefit of a multispan context. An integrative formulation is proposed for this purpose, in which the latent semantic information is used to adjust the standard n-gram probability. The performance of the resulting multispan language models, as measured by perplexity, compares favorably with the corresponding n-gram performance.
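Illustrative note (not part of the original record): the abstract describes using latent semantic information to adjust the standard n-gram probability. The sketch below shows one plausible shape such an adjustment could take, and is not claimed to be the paper's formulation. Each candidate word's n-gram probability is rescaled by a semantic closeness score between the word's vector and a vector summarizing the wider context, and the result is renormalized. The function names `cosine` and `adjust_ngram`, the mapping from cosine to a weight, and the toy vectors are all assumptions made for illustration; in practice the vectors would come from an LSA decomposition of a word-document matrix.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def adjust_ngram(ngram_probs, word_vecs, context_vec):
    """Rescale n-gram probabilities by a semantic weight, then renormalize.

    ngram_probs: dict mapping each candidate word to its local n-gram probability.
    word_vecs:   dict mapping each candidate word to its (hypothetical) LSA vector.
    context_vec: vector summarizing the large-span context (hypothetical).
    """
    # Assumption: map cosine in [-1, 1] to a non-negative weight in [0, 1].
    weights = {
        w: (1.0 + cosine(word_vecs[w], context_vec)) / 2.0 for w in ngram_probs
    }
    scaled = {w: p * weights[w] for w, p in ngram_probs.items()}
    total = sum(scaled.values())
    if total == 0.0:
        # Fall back to the unadjusted n-gram distribution.
        return dict(ngram_probs)
    return {w: s / total for w, s in scaled.items()}


# Toy data: the n-gram model alone cannot separate the two candidates,
# but the context vector lies closer to "stock" than to "soup".
ngram_probs = {"stock": 0.5, "soup": 0.5}
word_vecs = {"stock": [1.0, 0.0], "soup": [0.0, 1.0]}
context_vec = [0.9, 0.1]

adjusted = adjust_ngram(ngram_probs, word_vecs, context_vec)
```

With these toy inputs the adjusted distribution still sums to one and shifts probability mass toward the word whose vector lies closer to the context, which is the intended effect of combining local and global constraints.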
Keywords :
grammars; information retrieval; natural languages; probability; speech recognition; statistical analysis; clustering techniques; data-driven formalism; documents; global constraints; information retrieval; integrative formulation; large vocabulary speech recognition; large-span language models; latent semantic analysis; local constraints; multispan language modeling; n-gram performance; n-gram probability; perplexity; semantic relationships; smoothing properties; statistical n-gram modeling; vector representation; vector space; words; Acoustic applications; Context modeling; Frequency estimation; Information analysis; Information retrieval; Natural languages; Probability; Smoothing methods; Speech recognition; Vocabulary;
fLanguage :
English
Journal_Title :
Speech and Audio Processing, IEEE Transactions on
Publisher :
IEEE
ISSN :
1063-6676
Type :
jour
DOI :
10.1109/89.709671
Filename :
709671