Title :
Lexical access to large vocabularies for speech recognition
Author :
Fissore, Luciano ; Laface, Pietro ; Micca, Giorgio ; Pieraccini, Roberto
Author_Institution :
Centro Studi e Lab. Telecommun., Torino, Italy
fDate :
8/1/1989
Abstract :
A large-vocabulary isolated-word recognition system based on the hypothesize-and-test paradigm is described. Word preselection is achieved by segmenting and classifying the input signal in terms of broad phonetic classes. A lattice of phonetic segments is generated and organized as a graph. Word hypothesization is obtained by matching this graph against the models of all vocabulary words, where each word model is itself a phonetic representation expressed as a graph. A modified dynamic programming matching procedure gives an efficient solution to this graph-to-graph matching problem. Hidden Markov models (HMMs) of subword units provide more detailed knowledge in the verification step. The word candidates generated by the preselection step are represented as sequences of diphone-like subword units, and the Viterbi algorithm is used to evaluate their likelihood. Lexical knowledge is organized in a tree structure in which the common initial subsequences of word descriptions are shared, and a beam-search strategy extends only the most promising paths. The results show that the two-pass approach reduces complexity by about 73% compared with the direct approach, while the recognition accuracy remains comparable.
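The tree-structured lexicon and beam pruning mentioned in the abstract can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not give the paper's data structures, unit inventory, or pruning criterion, so the names `Node`, `build_lexical_tree`, `count_nodes`, `prune`, the toy lexicon, and the fixed log-score margin are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """One node of the lexical tree; each edge is labelled with a subword unit."""
    children: dict = field(default_factory=dict)
    words: list = field(default_factory=list)  # words whose description ends here


def build_lexical_tree(lexicon):
    """Insert every word so that common initial unit sequences share nodes."""
    root = Node()
    for word, units in lexicon.items():
        node = root
        for unit in units:
            node = node.children.setdefault(unit, Node())
        node.words.append(word)
    return root


def count_nodes(node):
    """Count the nodes of the tree, root included."""
    return 1 + sum(count_nodes(child) for child in node.children.values())


def prune(scores, beam_width):
    """Beam pruning (assumed criterion): keep hypotheses whose log-score
    lies within beam_width of the best one."""
    best = max(scores.values())
    return {h: s for h, s in scores.items() if s >= best - beam_width}


# Hypothetical toy lexicon; the unit labels are illustrative, not the
# paper's diphone-like units.
TOY_LEXICON = {
    "torino": ["t", "o", "r", "i", "n", "o"],
    "torre": ["t", "o", "r", "r", "e"],
    "roma": ["r", "o", "m", "a"],
}
tree = build_lexical_tree(TOY_LEXICON)
```

In this toy case "torino" and "torre" share the nodes for their common initial units, so any likelihood computed along that prefix would be evaluated once for both words; pruning then limits how many partial paths are carried forward at each step.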
Keywords :
speech recognition; Markov models; Viterbi algorithm; beam-search strategy; diphone-like subword units; dynamic programming matching; graph-to-graph matching; hypothesize-and-test paradigm; isolated-word recognition system; large vocabularies; phonetic classes; tree structure; Computational efficiency; Databases; Dynamic programming; Hidden Markov models; Lattices; Pattern matching; Telephony; Vocabulary
Journal_Title :
Acoustics, Speech and Signal Processing, IEEE Transactions on