• DocumentCode
    2800527
  • Title
    Discriminative template extraction for direct modeling
  • Author
    Shivappa, Shankar; Nguyen, Patrick; Zweig, Geoffrey
  • Author_Institution
    Dept. of Electr. & Comput. Eng., Univ. of California, San Diego, La Jolla, CA, USA
  • fYear
    2010
  • fDate
    14-19 March 2010
  • Firstpage
    4338
  • Lastpage
    4341
  • Abstract
    This paper addresses the problem of developing appropriate features for use in direct modeling approaches to speech recognition, such as those based on Maximum Entropy models or Segmental Conditional Random Fields. We propose a feature based on the detection of word-level templates which are discriminatively chosen based on a mutual information criterion. The templates for a word are derived directly from the MFCC feature vectors, based on self-similarity across examples. No pronunciation dictionary is used, and the resulting templates match closely to in-class examples and distantly to out-of-class examples. We utilize template detection events as input to a segmental CRF speech recognizer. We evaluate the entire scheme on a voice search task. The results show that the use of discriminative template based word detector streams improves the speech recognizer's performance over the baseline HMM results.
  • Keywords
    cepstral analysis; feature extraction; hidden Markov models; maximum entropy methods; random processes; speech recognition; HMM; MFCC feature vectors; Mel frequency cepstral coefficients; direct modeling approaches; discriminative template extraction; feature detection; maximum entropy models; segmental CRF speech recognizer; segmental conditional random fields; speech recognition; voice search task; word-level templates; Data mining; Decoding; Detectors; Dictionaries; Entropy; Event detection; Hidden Markov models; Mel frequency cepstral coefficient; Mutual information; Speech recognition; Discriminative Templates; Segmental Conditional Random Fields; Speech Recognition
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Acoustics Speech and Signal Processing (ICASSP), 2010 IEEE International Conference on
  • Conference_Location
    Dallas, TX
  • ISSN
    1520-6149
  • Print_ISBN
    978-1-4244-4295-9
  • Electronic_ISBN
    1520-6149
  • Type
    conf
  • DOI
    10.1109/ICASSP.2010.5495647
  • Filename
    5495647