100000-word recognition using acoustic-segment networks

Author

Kimura, Shinta

Author_Institution

Fujitsu Lab. Ltd., Kawasaki, Japan

fYear

1990

fDate

3-6 Apr 1990

Firstpage

61

Abstract

Speech recognition for a vocabulary of 100000 words is described. Acoustic-segment networks are used as word templates in recognition. The networks are generated automatically from the orthographic strings of the words, using rules that account for several kinds of variation in speech. To reduce the amount of computation in recognition, a tree representation of the networks and a preselection method based on input-frame sampling are used. It is confirmed that, with preselection that outputs 500 candidates for main matching, 98.75% of the computation can be eliminated without a significant increase in error. Top-20 recognition accuracy is 93.5% for 10000 test utterances from five male and five female speakers.
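
The tree representation mentioned in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: it assumes each word template is a plain linear sequence of segment labels (the paper uses networks with variations), and the function names, the toy vocabulary, and the segment labels are all hypothetical. It only shows how templates that share an initial segment sequence can share tree nodes, so shared prefixes need to be matched once rather than once per word.

```python
from typing import Dict, List


def build_prefix_tree(templates: Dict[str, List[str]]) -> dict:
    """Merge linear segment sequences into a tree that shares common prefixes.

    Assumption: each template is a flat list of segment labels. Real
    acoustic-segment networks would also contain alternative paths.
    """
    root = {"children": {}, "words": []}
    for word, segments in templates.items():
        node = root
        for seg in segments:
            # Reuse the child for this segment if it exists, else create it.
            node = node["children"].setdefault(seg, {"children": {}, "words": []})
        # The word ends at this node.
        node["words"].append(word)
    return root


def count_nodes(node: dict) -> int:
    """Count segment nodes below `node` (the root itself is not a segment)."""
    return sum(1 + count_nodes(child) for child in node["children"].values())


# Hypothetical toy vocabulary with made-up segment labels.
templates = {
    "kawa": ["k", "a", "w", "a"],
    "kawasaki": ["k", "a", "w", "a", "s", "a", "k", "i"],
    "kasa": ["k", "a", "s", "a"],
}

tree = build_prefix_tree(templates)
flat_cost = sum(len(segs) for segs in templates.values())  # segments matched one word at a time
tree_cost = count_nodes(tree)                              # segments matched with prefix sharing
print(flat_cost, tree_cost)
```

In this toy case the three templates hold 16 segments in total but the tree has only 10 segment nodes. The saving reported in the abstract also depends on the preselection step, which this sketch does not model.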

Keywords

knowledge based systems; speech analysis and processing; speech recognition; Japanese words; acoustic-segment networks; females; input-frame sampling; males; matching; orthographic strings of words; preselection method; recognition accuracy; test utterances; tree representation; variations in speech; vocabulary of 100000 words; word recognition; word templates; Acoustic testing; Automatic speech recognition; Computer networks; Information processing; Laboratories; Loudspeakers; Sampling methods; Speech analysis; Speech recognition; Testing; Vocabulary

fLanguage

English

Publisher

IEEE

Conference_Titel

Acoustics, Speech, and Signal Processing, 1990. ICASSP-90., 1990 International Conference on

Conference_Location

Albuquerque, NM

ISSN

1520-6149

Type

conf

DOI

10.1109/ICASSP.1990.115537

Filename

115537