Novel CI-backoff scheme for real-time embedded speech recognition

Author

Ma, Tao ; Deisher, Michael

Author_Institution

Mississippi State Univ., Starkville, MS, USA

fYear

2010

fDate

14-19 March 2010

Firstpage

1614

Lastpage

1617

Abstract

A new method for reducing computation and memory bandwidth in embedded large vocabulary continuous speech recognition is presented. During hidden Markov model state likelihood computation, scores for selected context-dependent (triphone) model states are computed several frames in advance. Scores that are subsequently needed for Viterbi search but not found in the buffer are replaced by the scores of the associated context-independent (monophone) models. On the Wall Street Journal 20,000-word continuous speech recognition task, the method achieves an overall 58% reduction in memory bandwidth and a 23% decrease in execution time relative to an assembly-optimized implementation of Sphinx 3. Recognition accuracy is reduced by <1%, while recognition latency is increased by 30 milliseconds.
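
The backoff rule described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names (`precompute_block`, `state_score`), the dictionary-based buffer, the `cd_to_ci` mapping, and the `lookahead` parameter are all hypothetical, and the abstract does not say how states are selected or how many frames are scored in advance.

```python
from typing import Callable, Dict, Iterable, List, Tuple

Score = float
# Hypothetical buffer layout: (frame index, CD state id) -> log-likelihood score.
Buffer = Dict[Tuple[int, int], Score]


def precompute_block(
    frames: List[List[float]],
    start: int,
    lookahead: int,
    selected_cd_states: Iterable[int],
    cd_score: Callable[[int, List[float]], Score],
) -> Buffer:
    """Score the selected CD states for frames start .. start + lookahead - 1."""
    buffer: Buffer = {}
    end = min(start + lookahead, len(frames))
    states = list(selected_cd_states)
    for t in range(start, end):
        for s in states:
            buffer[(t, s)] = cd_score(s, frames[t])
    return buffer


def state_score(
    t: int,
    cd_state: int,
    buffer: Buffer,
    cd_to_ci: Dict[int, int],
    ci_scores: List[Dict[int, Score]],
) -> Score:
    """Return the buffered CD score, or back off to the CI (monophone) score."""
    key = (t, cd_state)
    if key in buffer:
        return buffer[key]
    # CI backoff: substitute the score of the associated monophone state.
    return ci_scores[t][cd_to_ci[cd_state]]
```

In this sketch, a Viterbi search loop would call `state_score` for every active state; only states that were selected ahead of time get their triphone score, and every other request falls through to the already-available monophone score at the cost of a dictionary lookup.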

Keywords

embedded systems; hidden Markov models; maximum likelihood estimation; speech recognition; vocabulary; CI-backoff scheme; Hidden Markov model; Viterbi search; context independent models; context-dependent model; real-time embedded system; vocabulary continuous speech recognition; Bandwidth; Context modeling; Decoding; Delay; Embedded computing; Load modeling; Probability density function; HMM; LVCSR; acoustic modeling; backoff

fLanguage

English

Publisher

IEEE

Conference_Titel

Acoustics Speech and Signal Processing (ICASSP), 2010 IEEE International Conference on

Conference_Location

Dallas, TX

ISSN

1520-6149

Print_ISBN

978-1-4244-4295-9

Electronic_ISBN

1520-6149

Type

conf

DOI

10.1109/ICASSP.2010.5494887

Filename

5494887