• DocumentCode
    2973796
  • Title
    A multiplatform speech recognition decoder based on weighted finite-state transducers
  • Author
    Stoimenov, Emilian ; Schultz, Tanja
  • Author_Institution
    Cognitive Syst. Labs., Univ. of Karlsruhe, Karlsruhe, Germany
  • fYear
    2009
  • fDate
    Nov. 13 2009-Dec. 17 2009
  • Firstpage
    293
  • Lastpage
    298
  • Abstract
    Speech recognition decoders based on static graphs have recently proven to significantly outperform the traditional approach of prefix tree expansion in terms of decoding speed. The reduced search effort makes static graph decoders an attractive alternative for tasks constrained by limited processing power or memory footprint on devices such as PDAs, internet tablets, and smart phones. In this paper we explore the benefits of decoding with an optimized speech recognition network over the fully task-optimized prefix-tree-based decoder IBIS. We designed and implemented a new decoder called SWIFT (speedy weighted finite-state transducer) based on WFSTs, with its application to embedded platforms in mind. After describing the design and the network construction and storage process, we present evaluation results on a small task suitable for embedded applications and on a large task, namely the European Parliament Plenary Sessions (EPPS) task from the TC-STAR project. The SWIFT decoder is up to 50% faster than IBIS on both tasks. In addition, SWIFT significantly reduces memory consumption through our innovative network-specific storage layout optimization.
  • Keywords
    decoding; speech coding; speech recognition; European Parliament Plenary Sessions; PDA; internet tablets; multiplatform speech recognition decoder; network construction; prefix tree expansion; smart phones; speedy weighted finite-state transducer; static graph decoders; storage process; weighted finite-state transducers; Acoustic testing; Context modeling; Decoding; Fixed-point arithmetic; Internet; Personal digital assistants; Smart phones; Speech recognition; Transducers; Tree graphs;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Automatic Speech Recognition & Understanding, 2009. ASRU 2009. IEEE Workshop on
  • Conference_Location
    Merano
  • Print_ISBN
    978-1-4244-5478-5
  • Electronic_ISBN
    978-1-4244-5479-2
  • Type
    conf
  • DOI
    10.1109/ASRU.2009.5373404
  • Filename
    5373404