DocumentCode
730849
Title
Token-level interpolation for class-based language models
Author
Levit, Michael ; Stolcke, Andreas ; Chang, Shuangyu ; Parthasarathy, Sarangarajan
Author_Institution
Microsoft Corp., Redmond, WA, USA
fYear
2015
fDate
19-24 April 2015
Firstpage
5426
Lastpage
5430
Abstract
We describe a method for interpolating class-based n-gram language models. Our algorithm extends the traditional EM-based approach that optimizes training-set perplexity with respect to a collection of n-gram language models linearly combined in the probability space. Unlike prior work, however, it naturally supports context-dependent interpolation for class-based LMs. In addition, the method works naturally with the recently introduced word-phrase-entity (WPE) language models that unify words, phrases and entities in a single statistical framework. Applied to the Calendar scenario of the Personal Assistant domain, our method achieved significant perplexity reductions and improved word error rates.
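The EM baseline that the paper extends can be sketched in a few lines. The snippet below is a minimal, hypothetical illustration (not the authors' implementation): given each training token's probability under K component LMs, it iterates the standard E-step (posterior responsibility of each component) and M-step (renormalized responsibility mass) to find perplexity-optimal linear interpolation weights.

```python
def em_interpolation_weights(token_probs, iters=50):
    """Estimate linear interpolation weights for K language models via EM.

    token_probs: list of K-tuples; entry [t][k] is the probability that
    model k assigns to training token t in its context.
    Returns a list of K weights summing to 1.
    """
    K = len(token_probs[0])
    lam = [1.0 / K] * K          # uniform initialization
    for _ in range(iters):
        counts = [0.0] * K
        for probs in token_probs:
            # E-step: posterior probability that each model generated the token
            denom = sum(l * p for l, p in zip(lam, probs))
            for k in range(K):
                counts[k] += lam[k] * probs[k] / denom
        # M-step: weights proportional to accumulated responsibilities
        total = sum(counts)
        lam = [c / total for c in counts]
    return lam
```

The paper's contribution is to make such weights context-dependent and to apply the scheme at the token level inside class-based (and WPE) models, rather than fixing one global weight per component model as above.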
Keywords
expectation-maximisation algorithm; interpolation; natural language processing; probability; speech processing; statistical analysis; EM-based approach; WPE language models; calendar scenario; class-based LM; class-based n-gram language models; context-dependent interpolation; perplexity reduction; personal assistant domain; probability space; token-level interpolation; word error rates; word-phrase-entity language models; adaptation models; computational modeling; context modeling; training; language model interpolation
fLanguage
English
Publisher
IEEE
Conference_Titel
2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
Conference_Location
South Brisbane, QLD
Type
conf
DOI
10.1109/ICASSP.2015.7179008
Filename
7179008
Link To Document