• DocumentCode
    179351
  • Title
    Query-based composition for large-scale language model in LVCSR
  • Author
    Yang Han ; Chenwei Zhang ; Xiangang Li ; Yi Liu ; Xihong Wu

  • Author_Institution
    Speech & Hearing Res. Center, Peking Univ., Beijing, China
  • fYear
    2014
  • fDate
    4-9 May 2014
  • Firstpage
    4898
  • Lastpage
    4902
  • Abstract
    This paper describes a query-based composition algorithm that integrates an ARPA-format language model into the unified WFST framework, avoiding the memory and time cost of converting the language model to a WFST and then optimizing it. The proposed algorithm is applied to an on-the-fly one-pass decoder and a rescoring decoder. Both modified decoders require less memory during decoding across language models of different scales. Moreover, the query-based on-the-fly one-pass decoder decodes at nearly the same speed as the standard one, and the query-based rescoring decoder rescores lattices in even less time. Because of these advantages, the query-based composition algorithm makes it practical to apply large-scale language models to improve the performance of large vocabulary continuous speech recognition.
  • Keywords
    decoding; query processing; speech recognition; transducers; vocabulary; ARPA format language model; LVCSR; decoding speed; large vocabulary continuous speech recognition; large-scale language model; on-the-fly one-pass decoder; query-based composition; rescoring decoder; unified WFST framework; weighted finite-state transducer; Decoding; Hidden Markov models; Lattices; Memory management; Speech; Speech recognition; Standards; WFST; composition; large-scale language model; query-based; speech recognition;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Acoustics, Speech and Signal Processing (ICASSP), 2014 IEEE International Conference on
  • Conference_Location
    Florence
  • Type
    conf
  • DOI
    10.1109/ICASSP.2014.6854533
  • Filename
    6854533
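
The abstract's central idea — querying the ARPA language model on demand during composition instead of first expanding it into a G WFST — can be illustrated with a minimal sketch. This is not the authors' implementation; the class name, numbers, and toy model are hypothetical, and only the standard ARPA backoff lookup is shown:

```python
# Minimal sketch (assumptions, not the paper's code): a backoff n-gram model
# that is queried on the fly during decoding, rather than being precompiled
# and optimized as a G transducer. Scores are base-10 log values, as in the
# ARPA file format.
class QueryableArpaLM:
    def __init__(self, logprobs, backoffs):
        self.logprobs = logprobs    # {n-gram tuple: log10 probability}
        self.backoffs = backoffs    # {context tuple: log10 backoff weight}

    def score(self, history, word):
        """Return log10 P(word | history) with standard backoff."""
        ngram = history + (word,)
        if ngram in self.logprobs:          # explicit n-gram entry found
            return self.logprobs[ngram]
        if not history:                     # unseen unigram
            return float("-inf")
        # Back off: add the context's backoff weight, shorten the history.
        bow = self.backoffs.get(history, 0.0)
        return bow + self.score(history[1:], word)


# Toy 2-gram model with made-up numbers, queried as a decoder would.
lm = QueryableArpaLM(
    logprobs={("the",): -1.0, ("cat",): -2.0, ("the", "cat"): -0.5},
    backoffs={("the",): -0.3},
)
print(lm.score(("the",), "cat"))   # seen bigram -> -0.5
print(lm.score(("cat",), "the"))   # backs off to the unigram -> -1.0
```

In the query-based setup, a decoder state carries the language-model history and calls `score` only for the arcs it actually expands, so no memory is spent materializing or optimizing the full language-model transducer.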