  • DocumentCode
    1966547
  • Title
    A 40 nm 144 mW VLSI processor for realtime 60 kWord continuous speech recognition
  • Author
    He, Guangji ; Sugahara, Takanobu ; Fujinaga, Tsuyoshi ; Miyamoto, Yuki ; Noguchi, Hiroki ; Izumi, Shintaro ; Kawaguchi, Hiroshi ; Yoshimoto, Masahiko
  • Author_Institution
    Kobe Univ., Kobe, Japan
  • fYear
    2011
  • fDate
    19-21 Sept. 2011
  • Firstpage
    1
  • Lastpage
    4
  • Abstract
    We have developed a low-power VLSI chip for 60-kWord real-time continuous speech recognition based on a Hidden Markov Model (HMM). Our implementation includes a cache architecture that exploits the locality of speech recognition, beam pruning with a dynamic threshold, two-stage language model searching, highly parallel Gaussian Mixture Model (GMM) computation at the mixture level, a variable 50-frame look-ahead scheme, and elastic pipeline operation between the Viterbi transition and GMM processing. Results show that our implementation achieves a 95% bandwidth reduction (70.86 MB/s) and a 78% reduction in required frequency (126.5 MHz) for 60-kWord real-time continuous speech recognition. The test chip, fabricated in 40 nm CMOS technology and containing 1.9 M transistors for logic and 7.8 Mbit of on-chip memory, occupies an area of 2.2 mm × 2.5 mm. Measured data show a power consumption of 144 mW at 126.5 MHz and 1.1 V.
  • Keywords
    CMOS integrated circuits; Gaussian processes; VLSI; hidden Markov models; speech recognition; CMOS technology; HMM; VLSI processor; Viterbi transition; cache architecture; dynamic threshold; elastic pipeline operation; frequency 126.5 MHz; hidden Markov model; highly parallel Gaussian mixture model computation; mixture level; on-chip memory; power 144 mW; realtime continuous speech recognition; size 40 nm; speech recognition locality; two-stage language model searching; variable 50-frame look-ahead scheme; voltage 1.1 V; Accuracy; Bandwidth; Computer architecture; Hidden Markov models; Real time systems; Speech recognition; Viterbi algorithm; 40 nm VLSI; hidden Markov model (HMM); large vocabulary continuous speech recognition (LVCSR)
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Title
    Custom Integrated Circuits Conference (CICC), 2011 IEEE
  • Conference_Location
    San Jose, CA
  • ISSN
    0886-5930
  • Print_ISBN
    978-1-4577-0222-8
  • Type
    conf
  • DOI
    10.1109/CICC.2011.6055412
  • Filename
    6055412