DocumentCode :
2937667
Title :
Scalable HMM based inference engine in large vocabulary continuous speech recognition
Author :
Chong, Jike ; You, Kisun ; Yi, Youngmin ; Gonina, Ekaterina ; Hughes, Christopher ; Sung, Wonyong ; Keutzer, Kurt
Author_Institution :
Dept. of Electr. Eng. & Comput. Sci., Univ. of California, Berkeley, CA, USA
fYear :
2009
fDate :
June 28 2009-July 3 2009
Firstpage :
1797
Lastpage :
1800
Abstract :
Parallel scalability allows an application to efficiently utilize an increasing number of processing elements. In this paper we explore a design space for application scalability for an inference engine in large vocabulary continuous speech recognition (LVCSR). Our implementation of the inference engine involves a parallel graph traversal through an irregular graph-based knowledge network with millions of states and arcs. The challenge is not only to define a software architecture that exposes sufficient fine-grained application concurrency, but also to efficiently synchronize between an increasing number of concurrent tasks and to effectively utilize the parallelism opportunities in today's highly parallel processors. We propose four application-level implementation alternatives we call "algorithm styles", and construct highly optimized implementations on two parallel platforms: an Intel Core i7 multicore processor and an NVIDIA GTX280 manycore processor. The highest performing algorithm style varies with the implementation platform. On a 44-minute speech data set, we demonstrate substantial speedups of 3.4× on Core i7 and 10.5× on GTX280 compared to a highly optimized sequential implementation on Core i7, without sacrificing accuracy. The parallel implementations contain less than 2.5% sequential overhead, promising scalability and significant potential for further speedup on future platforms.
Keywords :
concurrency control; graph theory; hidden Markov models; parallel algorithms; reasoning about programs; software architecture; speech recognition; speech-based user interfaces; Intel Core i7 multicore processor; NVIDIA GTX280 manycore processor; fine-grained application concurrency; large vocabulary continuous speech recognition; parallel graph traversal; parallel processor; parallel scalability; scalable HMM-based inference engine; sequential programs; Application software; Concurrent computing; Engines; Hidden Markov models; Multicore processing; Scalability; Software architecture; Space exploration; Speech recognition; Vocabulary;
fLanguage :
English
Publisher :
ieee
Conference_Titel :
Multimedia and Expo, 2009. ICME 2009. IEEE International Conference on
Conference_Location :
New York, NY
ISSN :
1945-7871
Print_ISBN :
978-1-4244-4290-4
Electronic_ISBN :
1945-7871
Type :
conf
DOI :
10.1109/ICME.2009.5202871
Filename :
5202871