• DocumentCode
    2998615
  • Title

    A segment model based approach to speech recognition

  • Author

    Lee, Chin-Hui ; Soong, Frank K. ; Juang, Biing-hwang

  • Author_Institution
    AT&T Bell Lab., Murray Hill, NJ, USA
  • fYear
    1988
  • fDate
    11-14 Apr 1988
  • Firstpage
    501
  • Abstract
    Proposes a global acoustic segment model for characterizing fundamental speech sound units and their interactions based upon a general framework of hidden Markov models (HMM). Each segment model represents a class of acoustically similar sounds. The intra-segment variability of each sound class is modeled by an HMM, and the sound-to-sound transition rules are characterized by a probabilistic intersegment transition matrix. An acoustically-derived lexicon is used to construct word models based upon subword segment models. The proposed segment model was tested on a speaker-trained, isolated-word speech recognition task with a vocabulary of 1109 basic English words. In the current study, only 128 segment models were used, and recognition was performed by optimally aligning the test utterance with all acoustic lexicon entries using a maximum likelihood Viterbi decoding algorithm. Based upon a database of three male speakers, the average word recognition accuracy for the top candidate was 85% and increased to 96% and 98% for the top 3 and top 5 candidates, respectively.
  • Keywords
    Markov processes; acoustic signal processing; decoding; speech analysis and processing; speech recognition; English words; acoustically-derived lexicon; database; global acoustic segment model; hidden Markov models; isolated word speech recognition; maximum likelihood Viterbi decoding algorithm; probabilistic intersegment transition matrix; sound-to-sound transition rules; speaker trained speech recognition; speech sound units; subword segment models; word models; word recognition accuracy; Acoustic testing; Databases; Decoding; Hidden Markov models; Natural languages; Performance evaluation; Speech recognition; Training data; Viterbi algorithm; Vocabulary;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Acoustics, Speech, and Signal Processing, 1988. ICASSP-88., 1988 International Conference on
  • Conference_Location
    New York, NY
  • ISSN
    1520-6149
  • Type

    conf

  • DOI
    10.1109/ICASSP.1988.196629
  • Filename
    196629