• DocumentCode
    2559547
  • Title

    Recurrent neural network language model in mandarin voice input system

  • Author

    Si, Yujing ; Li, Ta ; Cai, Shang ; Pan, Jielin ; Yan, Yonghong

  • Author_Institution
    Key Lab. of Speech Acoust. & Content Understanding, Beijing, China
  • fYear
    2012
  • fDate
    29-31 May 2012
  • Firstpage
    270
  • Lastpage
    274
  • Abstract
    Over more than three decades, the development of automatic speech recognition (ASR) technology has made it possible for some intelligent query systems to use a voice interface. In particular, a voice input system is a practical and interesting application of ASR. In this paper, we present our recent work on using a Recurrent Neural Network Language Model (RNNLM) to improve the performance of our Mandarin voice input system. The system employs a two-pass strategy. In the first pass, a memory-efficient state network and a tri-gram language model are used to generate a word lattice, from which the n-best list is extracted. In the second pass, a large 4-gram language model and the RNNLM are used to re-rank the n-best list and output the new best hypothesis. Experiments showed that the RNNLM was very effective for n-best list rescoring. Overall, a 10.2% relative reduction in word error rate (from 13.7% to 12.3%) was achieved on a voice search task, compared with the first-pass result.
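    The second-pass re-ranking described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the abstract does not say how the 4-gram and RNNLM scores are combined, so the log-linear mix, the names `rescore_nbest`, `lam`, and `lm_weight`, and the assumption that each hypothesis carries a single first-pass log score are all hypothetical. The helper `relative_reduction` simply restates the arithmetic behind the reported relative word error rate reduction.

```python
from typing import Callable, List, Tuple

# A hypothesis is (word sequence, first-pass log score).
# Assumption: higher (less negative) scores are better.
Hypothesis = Tuple[str, float]


def rescore_nbest(
    nbest: List[Hypothesis],
    ngram_logprob: Callable[[str], float],  # hypothetical 4-gram scorer
    rnn_logprob: Callable[[str], float],    # hypothetical RNNLM scorer
    lam: float = 0.5,        # assumed interpolation weight for the RNNLM
    lm_weight: float = 1.0,  # assumed language model scale factor
) -> List[Hypothesis]:
    """Return the n-best list sorted best-first by the combined score."""

    def combined(hyp: Hypothesis) -> float:
        text, first_pass = hyp
        # Log-linear mix of the two language model scores (an assumption).
        lm = lam * rnn_logprob(text) + (1.0 - lam) * ngram_logprob(text)
        return first_pass + lm_weight * lm

    return sorted(nbest, key=combined, reverse=True)


def relative_reduction(baseline: float, new: float) -> float:
    """Relative reduction of an error rate, as a fraction of the baseline."""
    return (baseline - new) / baseline
```

    With the figures from the abstract, `relative_reduction(13.7, 12.3)` evaluates to about 0.102, which matches the reported 10.2% relative reduction.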
  • Keywords
    performance evaluation; query processing; recurrent neural nets; speech recognition; speech-based user interfaces; ASR technology; Mandarin voice input system; RNNLM; automatic speech recognition technology; intelligent query systems; large 4-gram language model; memory-efficient state network; n-best list extraction; n-best list re-ranking; n-best list re-score; performance improvement; recurrent neural network language model; tri-gram language model; two-pass strategy; voice interface; voice search task; word error rate reduction; word lattice generation; Acoustics; Computational modeling; Feature extraction; Lattices; Recurrent neural networks; Speech; Speech recognition; Mandarin voice input system; RNNLM; n-best list rescore;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Natural Computation (ICNC), 2012 Eighth International Conference on
  • Conference_Location
    Chongqing
  • ISSN
    2157-9555
  • Print_ISBN
    978-1-4577-2130-4
  • Type

    conf

  • DOI
    10.1109/ICNC.2012.6234689
  • Filename
    6234689