DocumentCode :
542287
Title :
Recent advances in efficient decoding combining on-line transducer composition and smoothed language model incorporation
Author :
Willett, Daniel ; Katagiri, Shigeru
Author_Institution :
Speech Open Lab, NTT Communication Science Laboratories, NTT Corporation, 2-4, Hikaridai, Seika-cho, Soraku-gun, Kyoto, Japan
Volume :
1
fYear :
2002
fDate :
13-17 May 2002
Abstract :
This paper presents and evaluates our recent efforts on efficient decoding for Large Vocabulary Continuous Speech Recognition in the framework of Weighted Finite State Transducers. We evaluate on-the-fly transducer composition for reduced memory consumption combined with weight smearing for a more time-synchronous language model incorporation. It turns out that in the on-line composition mode, weight smoothing within the static part of the network is even more beneficial to the run-time-to-accuracy ratio than in the fully precompiled case. Evaluations are carried out on a state-of-the-art recognition system with a 10k-word vocabulary, cross-word triphone acoustic models, and a trigram language model. In this scenario, the Viterbi search is carried out fully time-synchronously in a single pass. The combination of on-the-fly network composition with only the unigram part of the language model smoothly compiled into the network achieves a remarkably good run-time-to-accuracy ratio with only moderate memory requirements.
Keywords :
Adaptation model; Argon; Artificial neural networks; Hidden Markov models; Minimization; Smoothing methods;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Acoustics, Speech, and Signal Processing (ICASSP), 2002 IEEE International Conference on
Conference_Location :
Orlando, FL, USA
ISSN :
1520-6149
Print_ISBN :
0-7803-7402-9
Type :
conf
DOI :
10.1109/ICASSP.2002.5743817
Filename :
5743817