Multi-user real-time speech recognition with a GPU

Author

Kim, Jungsuk ; Sung, Wonyong

Author_Institution

Dept. of Electr. Eng. & Comput. Sci., Seoul Nat. Univ., Seoul, South Korea

fYear

2012

fDate

25-30 March 2012

Firstpage

1617

Lastpage

1620

Abstract

We have developed a multi-user large-vocabulary speech recognition system that employs a fully composed one-level weighted finite state transducer (WFST) based network on a Graphics Processing Unit (GPU). The system improves the overall throughput and latency of a speech recognition engine that processes multiple users' utterances at the same time, using efficient scheduling, parameter sharing, and communication overhead reduction techniques. We conduct both batch speech simulation and trace-driven online simulation to assess the performance of the developed system. Traces are generated based on a queueing model.

Keywords

graphics processing units; queueing theory; speech recognition; GPU; batch speech simulation; communication overhead reduction technique; graphics processing unit; multiuser large vocabulary speech recognition system; multiuser real-time speech recognition; parameter sharing; queueing model; speech recognition engine; trace driven online simulation; weighted finite state transducer based network; Acoustics; Engines; Graphics processing unit; Hidden Markov models; Servers; Speech; Speech recognition; Distributed Speech Recognition; GPU; LVCSR; Speech recognition; WFST

fLanguage

English

Publisher

IEEE

Conference_Titel

Acoustics, Speech and Signal Processing (ICASSP), 2012 IEEE International Conference on

Conference_Location

Kyoto

ISSN

1520-6149

Print_ISBN

978-1-4673-0045-2

Electronic_ISBN

1520-6149

Type

conf

DOI

10.1109/ICASSP.2012.6288204

Filename

6288204