Title :
Token-level interpolation for class-based language models
Author :
Levit, Michael ; Stolcke, Andreas ; Chang, Shuangyu ; Parthasarathy, Sarangarajan
Author_Institution :
Microsoft Corp., Redmond, WA, USA
Abstract :
We describe a method for interpolating class-based n-gram language models. Our algorithm extends the traditional EM-based approach that optimizes training-set perplexity with respect to a collection of n-gram language models combined linearly in probability space. Unlike prior work, however, it naturally supports context-dependent interpolation for class-based LMs. In addition, the method integrates seamlessly with the recently introduced word-phrase-entity (WPE) language models, which unify words, phrases and entities in a single statistical framework. Applied to the Calendar scenario of the Personal Assistant domain, our method achieved significant perplexity reductions and improved word error rates.
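The abstract's baseline, EM-based linear interpolation, can be sketched compactly. The following is a minimal illustration (not the paper's context-dependent extension): given each token's probability under every component model, EM re-estimates the interpolation weights so as to maximize training-set likelihood, i.e. minimize perplexity. The function name and the toy probabilities are illustrative stand-ins, not taken from the paper.

```python
def em_interpolation_weights(token_probs, iterations=50):
    """Estimate linear interpolation weights by EM.

    token_probs: one entry per training token; each entry is a list of
    per-model probabilities P_m(w_t | h_t).
    Returns a list of weights summing to 1.
    """
    num_models = len(token_probs[0])
    lambdas = [1.0 / num_models] * num_models  # uniform initialization
    for _ in range(iterations):
        expected = [0.0] * num_models
        for probs in token_probs:
            # mixture probability of this token under current weights
            mix = sum(l * p for l, p in zip(lambdas, probs))
            # E-step: posterior that model m generated this token
            for m in range(num_models):
                expected[m] += lambdas[m] * probs[m] / mix
        # M-step: normalize expected counts to get new weights
        total = sum(expected)
        lambdas = [e / total for e in expected]
    return lambdas

# Toy example: two models scoring a three-token corpus
# (hypothetical probabilities for illustration only).
probs = [[0.2, 0.05], [0.1, 0.3], [0.4, 0.1]]
weights = em_interpolation_weights(probs)
```

The paper's contribution replaces the single global weight vector with context-dependent weights and extends the computation to class-based (and WPE) models, where a token's probability factors through class and member probabilities.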
Keywords :
expectation-maximisation algorithm; interpolation; natural language processing; probability; speech processing; statistical analysis; EM-based approach; WPE language models; calendar scenario; class-based LM; class-based n-gram language models; context-dependent interpolation; expectation-maximisation approach; perplexity reduction; personal assistant domain; probability space; token-level interpolation; word error rates; word-phrase-entity language models; Adaptation models; Computational modeling; Context; Context modeling; Interpolation; Probability; Training; class-based language models; context-dependent interpolation; language model interpolation
Conference_Title :
2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
Conference_Location :
South Brisbane, QLD, Australia
DOI :
10.1109/ICASSP.2015.7179008