Demisyllable-based isolated word recognition system

Author

Rosenberg, Aaron E. ; Rabiner, Lawrence R. ; Wilpon, Jay G. ; Kahn, Daniel

Author_Institution

Bell Laboratories, Murray Hill, NJ, USA

Volume

31

Issue

3

fYear

1983

fDate

6/1/1983 12:00:00 AM

Firstpage

713

Lastpage

726

Abstract

A speaker-dependent speech recognition system is described for recognizing isolated word utterances using reference templates created by concatenating demisyllable (half-syllable) prototypes. Each word in a vocabulary is specified by one or more entries in a user-supplied lexicon containing a sequence of demisyllables drawn from a corpus of some 1000 units. Experiments were carried out with two talkers using a 1109-word "Basic English" vocabulary to assess the overall effectiveness of demisyllable representations for words. Also, the effects on performance of some simple modifications in demisyllable specifications and adjustments of demisyllable durations were investigated. The recognition error rates obtained for this vocabulary using demisyllable prototypes were 18-33 percent compared with 6-15 percent using whole word prototypes. Although the performance is substantially poorer using demisyllable representations in place of whole words, the approach of using a fixed inventory of smaller-than-word recognition units capable of representing any spoken word in a simple concatenative scheme is clearly an attractive alternative to whole-word prototypes for large-size vocabularies. The approach also has the potential of being effective in representing and recognizing continuous spoken utterances.
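The concatenative scheme described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. The abstract does not name the matching procedure, so a plain dynamic time warping (DTW) comparison is assumed here. The demisyllable labels, the one-dimensional feature frames, and the function names `build_word_template`, `dtw_distance`, and `recognize` are all hypothetical.

```python
# Toy illustration only: demisyllable names and feature values are invented.
# Each prototype is a short sequence of feature frames (lists of floats).
prototypes = {
    "k_a": [[0.0], [1.0]],
    "a_t": [[1.0], [2.0]],
    "d_o": [[5.0], [6.0]],
    "o_g": [[6.0], [7.0]],
}

# Lexicon: each word has one or more entries, each a sequence of demisyllables.
lexicon = {
    "cat": [["k_a", "a_t"]],
    "dog": [["d_o", "o_g"]],
}


def build_word_template(demisyllables, prototypes):
    """Concatenate demisyllable prototypes into one word reference template."""
    template = []
    for unit in demisyllables:
        template.extend(prototypes[unit])
    return template


def frame_distance(a, b):
    """Euclidean distance between two feature frames."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def dtw_distance(test, ref):
    """Length-normalized DTW distance (assumed matcher, not from the source)."""
    inf = float("inf")
    n, m = len(test), len(ref)
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = frame_distance(test[i - 1], ref[j - 1])
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m] / (n + m)


def recognize(test, lexicon, prototypes):
    """Return the word whose concatenated template is closest to the test utterance."""
    best_word, best_score = None, float("inf")
    for word, entries in lexicon.items():
        for entry in entries:
            score = dtw_distance(test, build_word_template(entry, prototypes))
            if score < best_score:
                best_word, best_score = word, score
    return best_word


utterance = [[0.1], [1.0], [1.1], [2.0]]
result = recognize(utterance, lexicon, prototypes)
```

Because every word template is assembled from a shared prototype inventory, adding a word to the vocabulary in this sketch only requires a new lexicon entry, not a new recorded template.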

Keywords

Acoustic signal processing; Automatic speech recognition; Error analysis; Pattern recognition; Performance evaluation; Prototypes; Signal processing algorithms; Speech processing; Speech recognition; Vocabulary

fLanguage

English

Journal_Title

Acoustics, Speech and Signal Processing, IEEE Transactions on

Publisher

IEEE

ISSN

0096-3518

Type

jour

DOI

10.1109/TASSP.1983.1164132

Filename

1164132