Title :
100000-word recognition using acoustic-segment networks
Author_Institution :
Fujitsu Lab. Ltd., Kawasaki, Japan
Abstract :
Speech recognition for a vocabulary of 100000 words is described. Acoustic-segment networks are used as word templates in recognition. The networks are generated automatically from the orthographic strings of the words, using rules that account for several kinds of variation in speech. To reduce the amount of computation in recognition, a tree representation of the networks and a preselection method based on input-frame sampling are used. With a preselection stage that outputs 500 candidates for main matching, 98.75% of the computation can be eliminated without a significant increase in error. Top-20 recognition accuracy is 93.5% for 10000 test utterances from five male and five female speakers.
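The two-stage idea in the abstract (a cheap preselection over sampled input frames, followed by main matching on a short candidate list) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: plain frame-sequence templates and dynamic time warping stand in for the paper's acoustic-segment networks and its main matching, the tree representation is not modeled, and the names `coarse_score`, `dtw_score`, `recognize`, and the `step` parameter are hypothetical. Only the default of 500 candidates and the top-20 output size are taken from the abstract; 500 candidates is 0.5% of a 100000-word vocabulary.

```python
import heapq


def frame_dist(a, b):
    """Squared Euclidean distance between two feature frames."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def coarse_score(frames, template, step):
    """Cheap preselection score (illustrative, not the paper's method).

    Only every `step`-th input frame is used; each sampled frame is
    compared with the template frame at the same relative position
    (linear time alignment). Assumes non-empty frames and template.
    """
    n, m = len(frames), len(template)
    total, count = 0.0, 0
    for i in range(0, n, step):
        j = min(m - 1, i * m // n)
        total += frame_dist(frames[i], template[j])
        count += 1
    return total / count


def dtw_score(frames, template):
    """Full matching by dynamic time warping over all frames."""
    n, m = len(frames), len(template)
    inf = float("inf")
    prev = [inf] * (m + 1)
    prev[0] = 0.0
    for i in range(1, n + 1):
        cur = [inf] * (m + 1)
        for j in range(1, m + 1):
            d = frame_dist(frames[i - 1], template[j - 1])
            cur[j] = d + min(prev[j], cur[j - 1], prev[j - 1])
        prev = cur
    return prev[m] / (n + m)


def recognize(frames, templates, n_candidates=500, step=4, top_n=20):
    """Preselect with the cheap score, then rank the shortlist by DTW.

    `templates` maps each word to its list of feature frames.
    Returns up to `top_n` words, best match first.
    """
    shortlist = heapq.nsmallest(
        n_candidates, templates,
        key=lambda w: coarse_score(frames, templates[w], step))
    return heapq.nsmallest(
        top_n, shortlist,
        key=lambda w: dtw_score(frames, templates[w]))


if __name__ == "__main__":
    # Toy one-dimensional "features" for three hypothetical words.
    toy_templates = {
        "a": [(0.0,), (0.0,), (1.0,), (1.0,)],
        "b": [(5.0,), (5.0,), (5.0,), (5.0,)],
        "c": [(0.0,), (1.0,), (0.0,), (1.0,)],
    }
    utterance = [(0.0,), (0.0,), (1.0,), (1.0,)]
    print(recognize(utterance, toy_templates, n_candidates=2, step=2, top_n=2))
```

In this sketch the expensive matcher runs only on the shortlist, so its cost scales with `n_candidates` rather than with the vocabulary size; the preselection cost scales with the vocabulary size but touches only a fraction of the input frames.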
Keywords :
knowledge based systems; speech analysis and processing; speech recognition; Japanese words; acoustic-segment networks; females; input-frame sampling; males; matching; orthographic strings of words; preselection method; recognition accuracy; test utterances; tree representation; variations in speech; vocabulary of 100000 words; word recognition; word templates; Acoustic testing; Automatic speech recognition; Computer networks; Information processing; Laboratories; Loudspeakers; Sampling methods; Speech analysis; Speech recognition; Testing; Vocabulary;
Conference_Title :
Acoustics, Speech, and Signal Processing, 1990. ICASSP-90., 1990 International Conference on
Conference_Location :
Albuquerque, NM
DOI :
10.1109/ICASSP.1990.115537