  • DocumentCode
    968193
  • Title
    A method for the construction of acoustic Markov models for words
  • Author
    Bahl, L.R.; Brown, P.F.; de Souza, P.V.; Mercer, R.L.; Picheny, M.A.
  • Author_Institution
    Speech Recognition Group, IBM Thomas J. Watson Research Center, Yorktown Heights, NY, USA
  • Volume
    1
  • Issue
    4
  • fYear
    1993
  • fDate
    10/1/1993
  • Firstpage
    443
  • Lastpage
    452
  • Abstract
    A technique for constructing Markov models for the acoustic representation of words is described. Word models are constructed from models of subword units called fenones. Fenones represent very short speech events and are obtained automatically through the use of a vector quantizer. The fenonic baseform for a word, i.e., the sequence of fenones used to represent the word, is derived automatically from one or more utterances of that word. Since the word models are all composed from a small inventory of subword models, training for large-vocabulary speech recognition systems can be accomplished with a small training script. A method for combining phonetic and fenonic models is presented. Results of experiments with speaker-dependent and speaker-independent models on several isolated-word recognition tasks are reported. The results are compared with those for phonetics-based Markov models and template-based dynamic programming (DP) matching.
  • Keywords
    Markov processes; speech recognition; vector quantisation; acoustic Markov models; acoustic representation; fenones; isolated-word recognition; large-vocabulary speech recognition systems; phonetic models; small training script; speaker-dependent models; speaker-independent models; subword units; vector quantizer; word models; Degradation; Error analysis; Natural languages; Parameter estimation; Speech recognition; Training data; Vocabulary
  • fLanguage
    English
  • Journal_Title
    Speech and Audio Processing, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    1063-6676
  • Type
    jour
  • DOI
    10.1109/89.242490
  • Filename
    242490