• DocumentCode
    2937667
  • Title
    Scalable HMM based inference engine in large vocabulary continuous speech recognition
  • Author
    Chong, Jike ; You, Kisun ; Yi, Youngmin ; Gonina, Ekaterina ; Hughes, Christopher ; Sung, Wonyong ; Keutzer, Kurt
  • Author_Institution
    Dept. of Electr. Eng. & Comput. Sci., Univ. of California, Berkeley, CA, USA
  • fYear
    2009
  • fDate
    June 28 2009-July 3 2009
  • Firstpage
    1797
  • Lastpage
    1800
  • Abstract
    Parallel scalability allows an application to efficiently utilize an increasing number of processing elements. In this paper we explore a design space for application scalability for an inference engine in large vocabulary continuous speech recognition (LVCSR). Our implementation of the inference engine involves a parallel graph traversal through an irregular graph-based knowledge network with millions of states and arcs. The challenge is not only to define a software architecture that exposes sufficient fine-grained application concurrency, but also to efficiently synchronize between an increasing number of concurrent tasks and to effectively utilize the parallelism opportunities in today's highly parallel processors. We propose four application-level implementation alternatives we call "algorithm styles", and construct highly optimized implementations on two parallel platforms: an Intel Core i7 multicore processor and an NVIDIA GTX280 manycore processor. The highest performing algorithm style varies with the implementation platform. On a 44-minute speech data set, we demonstrate substantial speedups of 3.4× on Core i7 and 10.5× on GTX280 compared to a highly optimized sequential implementation on Core i7 without sacrificing accuracy. The parallel implementations contain less than 2.5% sequential overhead, promising scalability and significant potential for further speedup on future platforms.
  • Keywords
    concurrency control; graph theory; hidden Markov models; parallel algorithms; reasoning about programs; software architecture; speech recognition; speech-based user interfaces; Intel Core i7 multicore processor; NVIDIA GTX280 manycore processor; fine-grained application concurrency; large vocabulary continuous speech recognition; parallel graph traversal; parallel processor; parallel scalability; scalable HMM-based inference engine; sequential programs; software architecture; Application software; Concurrent computing; Engines; Hidden Markov models; Multicore processing; Scalability; Software architecture; Space exploration; Speech recognition; Vocabulary;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Multimedia and Expo, 2009. ICME 2009. IEEE International Conference on
  • Conference_Location
    New York, NY
  • ISSN
    1945-7871
  • Print_ISBN
    978-1-4244-4290-4
  • Electronic_ISBN
    1945-7871
  • Type
    conf
  • DOI
    10.1109/ICME.2009.5202871
  • Filename
    5202871