Title :
Pruning exponential language models
Author :
Chen, Stanley F. ; Sethy, Abhinav ; Ramabhadran, Bhuvana
Author_Institution :
IBM T.J. Watson Res. Center, Yorktown Heights, NY, USA
Abstract :
Language model pruning is an essential technology for speech applications running on resource-constrained devices, and many pruning algorithms have been developed for conventional word n-gram models. However, while exponential language models can give superior performance, there has been little work on the pruning of these models. In this paper, we propose several pruning algorithms for general exponential language models. We show that our best algorithm applied to an exponential n-gram model outperforms existing n-gram model pruning algorithms by up to 0.4% absolute in speech recognition word-error rate on Wall Street Journal and Broadcast News data sets. In addition, we show that Model M, an exponential class-based language model, retains its performance improvement over conventional word n-gram models when pruned to equal size, with gains of up to 2.5% absolute in word-error rate.
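Background note (standard log-linear formulation assumed for context; not quoted from the paper): an exponential language model with features f_i and weights \lambda_i assigns
\[
p_\Lambda(y \mid x) \;=\; \frac{\exp\!\big(\sum_i \lambda_i f_i(x, y)\big)}{\sum_{y'} \exp\!\big(\sum_i \lambda_i f_i(x, y')\big)},
\]
and pruning algorithms of the kind discussed remove features (e.g., n-gram features) while keeping the divergence between the pruned and original models small.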
Keywords :
natural languages; speech recognition; Broadcast News; Wall Street Journal; exponential language model pruning; general exponential language model; pruning algorithms; resource constrained device; speech applications; speech recognition word error rate; word n-gram model; Adaptation models; Computational modeling; Data models; Entropy; Mathematical model; Smoothing methods; Training;
Conference_Title :
2011 IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU)
Conference_Location :
Waikoloa, HI
Print_ISBN :
978-1-4673-0365-1
Electronic_ISBN :
978-1-4673-0366-8
DOI :
10.1109/ASRU.2011.6163937