Title :
A 40 nm 144 mW VLSI Processor for Real-Time 60-kWord Continuous Speech Recognition
Author :
He, Guangji ; Sugahara, Takanobu ; Miyamoto, Yuki ; Fujinaga, Tsuyoshi ; Noguchi, Hiroki ; Izumi, Shintaro ; Kawaguchi, Hiroshi ; Yoshimoto, Masahiko
Author_Institution :
Grad. Sch. of Syst. Inf., Kobe Univ., Kobe, Japan
Abstract :
We have developed a low-power VLSI chip for 60-kWord real-time continuous speech recognition based on a context-dependent hidden Markov model (HMM). Our implementation includes a cache architecture that exploits the locality of speech recognition, beam pruning using a dynamic threshold, two-stage language model searching, highly parallel Gaussian mixture model (GMM) computation based on the mixture level, a variable-frame look-ahead scheme, and elastic pipeline operation between the Viterbi transition and GMM processing. The accuracy degradation associated with the important parameters in the Viterbi computation is rigorously examined. Results show that our implementation achieves a 95% reduction in bandwidth (70.86 MB/s) and a 78% reduction in required frequency (126.5 MHz) compared with the reference Julius system. The test chip, fabricated in 40 nm CMOS technology, contains 1.9 M transistors for logic and 7.8 Mbit of on-chip memory. It dissipates 144 mW at 126.5 MHz and 1.1 V for 60-kWord real-time continuous speech recognition.
Keywords :
CMOS integrated circuits; Gaussian processes; VLSI; hidden Markov models; microprocessor chips; speech recognition; CMOS technology; VLSI processor; Viterbi transition; beam pruning; cache architecture; context dependent hidden Markov model; dynamic threshold; elastic pipeline operation; frequency 126.5 MHz; frequency reduction; on-chip memory; parallel Gaussian mixture model; power 144 mW; real time continuous speech recognition; referential Julius system; size 40 nm; two stage language model searching; variable frame look ahead scheme; voltage 1.1 V; Accuracy; Bandwidth; Computational modeling; Hidden Markov models; Speech recognition; Very large scale integration; Viterbi algorithm; 40 nm VLSI; hidden Markov model (HMM); large vocabulary continuous speech recognition (LVCSR); memory bandwidth reduction;
Journal_Title :
Circuits and Systems I: Regular Papers, IEEE Transactions on
DOI :
10.1109/TCSI.2012.2206501