• DocumentCode
    1157027
  • Title

    Lexical access to large vocabularies for speech recognition

  • Author

    Fissore, Luciano ; Laface, Pietro ; Micca, Giorgio ; Pieraccini, Roberto

  • Author_Institution
    Centro Studi e Lab. Telecommun., Torino, Italy
  • Volume
    37
  • Issue
    8
  • fYear
    1989
  • fDate
    8/1/1989
  • Firstpage
    1197
  • Lastpage
    1213
  • Abstract
    A large-vocabulary isolated-word recognition system based on the hypothesize-and-test paradigm is described. Word preselection is achieved by segmenting and classifying the input signal in terms of broad phonetic classes. A lattice of phonetic segments is generated and organized as a graph. Word hypotheses are obtained by matching this graph against the models of all vocabulary words, where each word model is itself a phonetic representation in the form of a graph. A modified dynamic programming matching procedure gives an efficient solution to this graph-to-graph matching problem. Hidden Markov models (HMMs) of subword units provide more detailed knowledge in the verification step. The word candidates generated by the preselection step are represented as sequences of diphone-like subword units, and the Viterbi algorithm is used to evaluate their likelihood. Lexical knowledge is organized in a tree structure in which the common initial subsequences of word descriptions are shared, and a beam-search strategy pursues only the most promising paths. The results show that the two-pass approach reduces complexity by about 73% relative to the direct approach, while the recognition accuracy remains comparable.
  • Keywords
    speech recognition; Markov models; Viterbi algorithm; beam-search strategy; diphone-like subword units; dynamic programming matching; graph-to-graph matching; hypothesize-and-test paradigm; isolated-word recognition system; large vocabularies; phonetic classes; tree structure; Computational efficiency; Databases; Dynamic programming; Hidden Markov models; Lattices; Pattern matching; Telephony; Vocabulary
  • fLanguage
    English
  • Journal_Title
    Acoustics, Speech and Signal Processing, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    0096-3518
  • Type

    jour

  • DOI
    10.1109/29.31268
  • Filename
    31268
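
The tree-organized lexicon and beam pruning mentioned in the abstract can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration: the toy lexicon, the unit labels, and the names `build_lexical_tree`, `count_nodes`, and `prune` are hypothetical and do not come from the paper, whose units are diphone-like and whose scores are HMM Viterbi likelihoods.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    children: dict = field(default_factory=dict)  # unit label -> child Node
    words: list = field(default_factory=list)     # words that end at this node


def build_lexical_tree(lexicon):
    """Build a prefix tree; words with the same initial units share nodes."""
    root = Node()
    for word, units in lexicon.items():
        node = root
        for unit in units:
            node = node.children.setdefault(unit, Node())
        node.words.append(word)
    return root


def count_nodes(node):
    """Count the nodes below `node` (the node itself is not counted)."""
    return sum(1 + count_nodes(child) for child in node.children.values())


def prune(hypotheses, beam_width):
    """Keep (label, log_score) pairs scoring within beam_width of the best."""
    best = max(score for _, score in hypotheses)
    return [(h, s) for h, s in hypotheses if s >= best - beam_width]


# Hypothetical toy lexicon; the unit labels are placeholders only.
LEXICON = {
    "cane": ["k", "a", "n", "e"],
    "cani": ["k", "a", "n", "i"],
    "casa": ["k", "a", "s", "a"],
}

tree = build_lexical_tree(LEXICON)
total_units = sum(len(units) for units in LEXICON.values())
shared_nodes = count_nodes(tree)
```

In this toy case the three words contain 12 units in total but need only 7 tree nodes, because the shared prefixes are stored, and would be scored, once. The `prune` helper shows the beam idea in isolation: hypotheses falling more than `beam_width` below the current best log score are dropped before the search advances.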