Lexical access to large vocabularies for speech recognition

Author

Fissore, Luciano ; Laface, Pietro ; Micca, Giorgio ; Pieraccini, Roberto

Author_Institution

Centro Studi e Lab. Telecommun., Torino, Italy

Volume

37

Issue

8

fYear

1989

fDate

8/1/1989

Firstpage

1197

Lastpage

1213

Abstract

A large-vocabulary isolated-word recognition system based on the hypothesize-and-test paradigm is described. Word preselection is achieved by segmenting and classifying the input signal in terms of broad phonetic classes. A lattice of phonetic segments is generated and organized as a graph. Word hypotheses are obtained by matching this graph against the models of all vocabulary words, where each word model is itself a phonetic representation in the form of a graph. A modified dynamic programming matching procedure gives an efficient solution to this graph-to-graph matching problem. Hidden Markov models (HMMs) of subword units provide more detailed knowledge in the verification step. The word candidates generated by the preselection step are represented as sequences of diphone-like subword units, and the Viterbi algorithm is used to evaluate their likelihood. Lexical knowledge is organized in a tree structure in which the common initial subsequences of word descriptions are shared, and a beam-search strategy extends only the most promising paths. The results show that the two-pass approach reduces complexity by about 73% relative to the direct approach, while recognition accuracy remains comparable.
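
The tree-structured lexicon with beam search mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the three-word lexicon, the letter-like unit labels, the per-position log scores, and the names `build_tree`, `count_nodes`, and `beam_search` are all hypothetical, and a simple additive score stands in for the HMM/Viterbi likelihood used in the paper.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class Node:
    """One node of the lexical tree; an edge label is a subword unit."""
    children: Dict[str, "Node"] = field(default_factory=dict)
    word: Optional[str] = None  # set when a word description ends here


def build_tree(lexicon: Dict[str, List[str]]) -> Node:
    """Insert every word so that common initial unit sequences share nodes."""
    root = Node()
    for word, units in lexicon.items():
        node = root
        for unit in units:
            node = node.children.setdefault(unit, Node())
        node.word = word
    return root


def count_nodes(node: Node) -> int:
    """Count all nodes, including the root."""
    return 1 + sum(count_nodes(child) for child in node.children.values())


def beam_search(root: Node,
                position_scores: List[Dict[str, float]],
                beam: float) -> List[Tuple[float, str]]:
    """Expand the tree one unit per position, keeping only paths whose
    accumulated log score is within `beam` of the current best path."""
    active = [(0.0, root)]
    for scores in position_scores:
        expanded = [(total + scores[unit], child)
                    for total, node in active
                    for unit, child in node.children.items()
                    if unit in scores]  # units without a score are dropped
        if not expanded:
            return []
        best = max(total for total, _ in expanded)
        active = [(t, n) for t, n in expanded if t >= best - beam]
    # Report only paths that end exactly on a complete word, best first.
    return sorted(((t, n.word) for t, n in active if n.word is not None),
                  reverse=True)


# Hypothetical toy lexicon: three words sharing the initial units "c", "a".
LEXICON = {
    "casa": ["c", "a", "s", "a"],
    "caso": ["c", "a", "s", "o"],
    "cane": ["c", "a", "n", "e"],
}

# Hypothetical log scores for each unit at each of four positions.
SCORES = [
    {"c": -0.1},
    {"a": -0.2},
    {"s": -0.3, "n": -2.0},
    {"a": -0.5, "o": -0.4, "e": -0.1},
]

tree = build_tree(LEXICON)
narrow = [word for _, word in beam_search(tree, SCORES, beam=1.0)]
wide = [word for _, word in beam_search(tree, SCORES, beam=5.0)]
```

In this toy setup the shared prefix is stored once, so the tree needs fewer nodes than three separate unit sequences would, and the narrow beam discards the low-scoring "n" branch before its final unit is ever evaluated. Widening the beam keeps that branch alive at the cost of more expansions, which is the accuracy versus computation trade-off a beam width controls.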

Keywords

speech recognition; Markov models; Viterbi algorithm; beam-search strategy; diphone-like subword units; dynamic programming matching; graph-to-graph matching; hypothesize-and-test paradigm; isolated-word recognition system; large vocabularies; phonetic classes; tree structure; Computational efficiency; Databases; Dynamic programming; Hidden Markov models; Lattices; Pattern matching; Telephony; Vocabulary

fLanguage

English

Journal_Title

Acoustics, Speech and Signal Processing, IEEE Transactions on

Publisher

IEEE

ISSN

0096-3518

Type

jour

DOI

10.1109/29.31268

Filename

31268