  • DocumentCode
    1867374
  • Title
    Vector quantization with memory and multi-labeling for isolated video-only automatic speech recognition
  • Author
    Terry, Louis H. ; Shiell, Derek J. ; Katsaggelos, Aggelos K.
  • Author_Institution
    Dept. of Electr. Eng. & Comput. Sci., Northwestern Univ., Evanston, IL
  • fYear
    2008
  • fDate
    12-15 Oct. 2008
  • Firstpage
    1320
  • Lastpage
    1323
  • Abstract
    We describe a vector quantizer (VQ) with memory for automatic speech recognition (ASR) and compare its recognition performance with that of traditional memoryless VQ for ASR. Standard VQ for ASR quantizes the speech data independently of any past information. We introduce memory in a probabilistic framework for quantization state modeling. This is accomplished in the form of an ergodic hidden Markov model (HMM) in which the state occupied by the HMM represents the quantization label. We evaluate this approach in the context of video-only isolated digit ASR and implement both single-stream (single-labeling) and multi-stream (multi-labeling) systems. For single-stream recognition, our approach increases the recognition rate from 62.67% to 66.95%. When using multi-labeling, our proposed vector quantizer with memory consistently outperforms the memoryless vector quantizer.
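The contrast drawn in the abstract between memoryless VQ and VQ with memory can be illustrated with a minimal sketch. Nothing here is taken from the paper itself: the function names, the isotropic Gaussian emission model centered on each codeword, the `var` parameter, and the use of Viterbi decoding to pick the label sequence are all assumptions made for illustration. The only idea carried over from the abstract is that each state of an ergodic HMM stands for one quantization label.

```python
import numpy as np

def memoryless_vq(features, codebook):
    """Label each frame with its nearest codeword, ignoring all other frames."""
    # Squared Euclidean distance from every frame to every codeword: shape (T, K)
    d2 = ((features[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
    return d2.argmin(axis=1)

def vq_with_memory(features, codebook, log_trans, log_prior, var=1.0):
    """Viterbi labels under an ergodic HMM whose states are the codewords.

    Assumption (not from the paper): state k emits an isotropic Gaussian
    centered on codebook[k] with variance `var`.
    """
    d2 = ((features[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
    log_emit = -0.5 * d2 / var  # Gaussian log-likelihood up to a constant
    T, K = log_emit.shape
    delta = log_prior + log_emit[0]        # best log-score ending in each state
    back = np.zeros((T, K), dtype=int)     # back-pointers for the best path
    for t in range(1, T):
        scores = delta[:, None] + log_trans  # scores[i, j]: move from i to j
        back[t] = scores.argmax(axis=0)
        delta = scores.max(axis=0) + log_emit[t]
    labels = np.empty(T, dtype=int)
    labels[-1] = delta.argmax()
    for t in range(T - 1, 0, -1):          # trace the best path backwards
        labels[t - 1] = back[t, labels[t]]
    return labels

# Toy 1-D example: one frame (0.55) sits just past the decision boundary.
codebook = np.array([[0.0], [1.0]])
features = np.array([[0.1], [0.2], [0.55], [0.1], [0.0]])
sticky = np.log(np.array([[0.9, 0.1], [0.1, 0.9]]))  # labels tend to persist
prior = np.log(np.array([0.5, 0.5]))

print(memoryless_vq(features, codebook))
print(vq_with_memory(features, codebook, sticky, prior))
```

In this toy case the memoryless quantizer flips the label of the borderline frame, while the transition probabilities let the quantizer with memory keep the label of its neighbours. With uniform transition probabilities the two functions return the same labels, so the transition matrix is the only place where memory enters this sketch.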
  • Keywords
    hidden Markov models; probability; speech recognition; vector quantisation; ergodic hidden Markov model; isolated video-only automatic speech recognition; memoryless VQ; memoryless vector quantizer; multilabeling systems; multistream systems; probabilistic framework; quantization state modeling; recognition performance; single labeling systems; single stream recognition; single stream systems; vector quantization; video-only isolated digit ASR; Acceleration; Application software; Automatic speech recognition; Feature extraction; Hidden Markov models; Labeling; Speech recognition; Streaming media; Telephony; Vector quantization; Hidden Markov Models; Speech Recognition; Vector Quantization;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Image Processing, 2008. ICIP 2008. 15th IEEE International Conference on
  • Conference_Location
    San Diego, CA
  • ISSN
    1522-4880
  • Print_ISBN
    978-1-4244-1765-0
  • Electronic_ISBN
    1522-4880
  • Type
    conf
  • DOI
    10.1109/ICIP.2008.4712006
  • Filename
    4712006