DocumentCode
2559547
Title
Recurrent neural network language model in mandarin voice input system
Author
Si, Yujing ; Li, Ta ; Cai, Shang ; Pan, Jielin ; Yan, Yonghong
Author_Institution
Key Lab. of Speech Acoust. & Content Understanding, Beijing, China
fYear
2012
fDate
29-31 May 2012
Firstpage
270
Lastpage
274
Abstract
Over more than three decades, the development of automatic speech recognition (ASR) technology has made it possible for some intelligent query systems to use a voice interface. In particular, a voice input system is a practical and interesting application of ASR. In this paper, we present our recent work on using a Recurrent Neural Network Language Model (RNNLM) to improve the performance of our Mandarin voice input system. The system employs a two-pass strategy. In the first pass, a memory-efficient state network and a tri-gram language model are used to generate the word lattice from which the n-best list is extracted. In the second pass, a large 4-gram language model and the RNNLM are used to re-rank the n-best list and output the new best hypothesis. Experiments showed that the RNNLM was very effective for n-best list rescoring. Overall, a 10.2% relative reduction in word error rate (from 13.7% to 12.3%) was achieved on a voice search task, compared with the first-pass result.
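The second-pass re-ranking described in the abstract can be illustrated with a minimal sketch. The abstract does not say how the 4-gram and RNNLM scores are combined, so the sentence-level interpolation of log-probabilities below is an assumption, as are all names (`Hypothesis`, `rescore`, `lm_weight`, `rnn_interp`) and the score fields. The helper `relative_reduction` only checks the arithmetic behind the reported figure.

```python
from dataclasses import dataclass


@dataclass
class Hypothesis:
    # All fields are hypothetical; the abstract does not specify the score layout.
    words: str
    acoustic_logprob: float  # acoustic score carried over from the first pass
    ngram_logprob: float     # sentence log-probability under the 4-gram LM
    rnnlm_logprob: float     # sentence log-probability under the RNNLM


def rescore(nbest, lm_weight=1.0, rnn_interp=0.5):
    """Return the n-best list sorted by combined score, best first.

    Assumed combination (not taken from the paper):
    acoustic + lm_weight * (rnn_interp * rnnlm + (1 - rnn_interp) * ngram).
    """
    def combined(h):
        lm = rnn_interp * h.rnnlm_logprob + (1.0 - rnn_interp) * h.ngram_logprob
        return h.acoustic_logprob + lm_weight * lm

    return sorted(nbest, key=combined, reverse=True)


def relative_reduction(baseline_wer, new_wer):
    """Relative WER reduction in percent."""
    return (baseline_wer - new_wer) / baseline_wer * 100.0


if __name__ == "__main__":
    nbest = [
        Hypothesis("a", acoustic_logprob=-100.0, ngram_logprob=-20.0, rnnlm_logprob=-30.0),
        Hypothesis("b", acoustic_logprob=-101.0, ngram_logprob=-18.0, rnnlm_logprob=-22.0),
    ]
    best = rescore(nbest)[0]
    print(best.words)
    # Figures from the abstract: 13.7% WER after the first pass, 12.3% after rescoring.
    print(round(relative_reduction(13.7, 12.3), 1))
```

In this toy list, hypothesis "a" has the better acoustic score, but "b" wins once both language model scores are added, which is the kind of re-ranking the second pass is meant to perform.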
Keywords
performance evaluation; query processing; recurrent neural nets; speech recognition; speech-based user interfaces; ASR technology; Mandarin voice input system; RNNLM; automatic speech recognition technology; intelligent query systems; large 4-gram language model; memory-efficient state network; n-best list extraction; n-best list re-ranking; n-best list re-score; performance improvement; recurrent neural network language model; tri-gram language model; two-pass strategy; voice interface; voice search task; word error rate reduction; word lattice generation; Acoustics; Computational modeling; Feature extraction; Lattices; Recurrent neural networks; Speech; Speech recognition; Mandarin voice input system; RNNLM; n-best list rescore;
fLanguage
English
Publisher
IEEE
Conference_Titel
Natural Computation (ICNC), 2012 Eighth International Conference on
Conference_Location
Chongqing
ISSN
2157-9555
Print_ISBN
978-1-4577-2130-4
Type
conf
DOI
10.1109/ICNC.2012.6234689
Filename
6234689