Title :
Connected sentence recognition using diphone-like templates
Author_Institution :
AT&T Bell Lab., Murray Hill, NJ, USA
Abstract :
A template-based connected speech recognition system that represents words as sequences of diphone-like segments has been implemented and tested on a database of 50 phonetically balanced sentences uttered 5 times by a single male talker. The sentences contain 250 words, of which 80% are monosyllabic. The inventory of segments is divided into two principal classes: single phone segments, such as vowels, nasals, fricatives, and stop bursts, and diphone segments, including consonant-vowel, vowel-consonant, and consonant-consonant combinations. Words are represented by network models whose nodes are these segments. Word models incorporate juncture branches to and from other words. 400 segments are required to represent the 250 vocabulary words. Templates representing these segments are extracted from a database of 450 training sentences uttered by the same talker. Recognition is carried out by a series of matching and search processes, successively for segments, words, word strings, and sentences. The performance obtained to date has yielded 63% correct recognition of content words and approximately 30% recognition of function words.
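The segment inventory described in the abstract can be pictured with a minimal sketch. It is an illustration only, not the authors' implementation: the phone labels, the `VOWELS` set, the helper names, and the strict alternation of single-phone and diphone segments are all assumptions made for this example, and the sketch produces a single linear path rather than a network with juncture branches.

```python
# Hypothetical illustration: phone labels and the alternating layout are assumptions,
# not taken from the paper.
VOWELS = {"ae", "ih", "aa"}  # assumed toy vowel set


def phone_class(phone):
    """Return 'V' for a vowel and 'C' for any other phone."""
    return "V" if phone in VOWELS else "C"


def segment_sequence(phones):
    """Interleave single-phone segments with the diphone segments joining neighbours."""
    segments = []
    for i, phone in enumerate(phones):
        # Single-phone segment (vowel, nasal, fricative, stop burst, ...).
        segments.append((phone, "phone"))
        if i + 1 < len(phones):
            nxt = phones[i + 1]
            # Diphone segment labelled by the classes it joins, e.g. CV, VC, CC.
            kind = phone_class(phone) + phone_class(nxt)
            segments.append((f"{phone}-{nxt}", kind))
    return segments


if __name__ == "__main__":
    # Toy word /m ae n/: nasal, vowel, nasal.
    for name, kind in segment_sequence(["m", "ae", "n"]):
        print(name, kind)
```

In this toy layout each word maps to one path of segment nodes; a network model as described in the abstract would additionally allow alternative paths and juncture branches to and from neighbouring words.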
Keywords :
acoustic signal processing; speech analysis and processing; speech recognition; connected sentence recognition; connected speech recognition system; consonant-consonant; consonant-vowel; database; diphone segments; diphone-like templates; fricatives; nasals; network models; phonetically balanced sentences; single phone segments; stop bursts; training sentences; vocabulary; vowel-consonant; vowels; Autocorrelation; Cepstral analysis; Error analysis; Linear predictive coding; Speech recognition; Steady-state; Vocabulary;
Conference_Title :
Acoustics, Speech, and Signal Processing, 1988. ICASSP-88., 1988 International Conference on
Conference_Location :
New York, NY
DOI :
10.1109/ICASSP.1988.196621