A continuous speech recognition system using finite state network and Viterbi beam search for the automatic interpretation

Author

Han, Nam-Yong ; Kim, Hoi-Rin ; Hwang, Kyu-Woong ; Ahn, Young-Mok ; Ryoo, Joon-Hyung

Author_Institution

Electron. & Telecommun. Res. Inst., Seoul, South Korea

Volume

1

fYear

1995

fDate

9-12 May 1995

Firstpage

117

Abstract

This paper describes a Korean continuous speech recognition system for automatic interpretation that uses a phone-based semi-continuous hidden Markov model (SCHMM) method. The task domain is hotel reservation. The system (composed of speech recognition, machine translation, and speech synthesis) has the following three features. First, an embedded bootstrapping training method is used, which allows each phone model to be trained without a phoneme segmentation database. Second, a hybrid estimation method that combines the forward-backward algorithm and the Viterbi algorithm is proposed for HMM parameter estimation. Third, a between-word modeling technique is used at function word boundaries. The recognition results in speaker-independent experiments are as follows: the continuous speech recognition result is 89.1% for Version 1 and 97.6% for Version 2.
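
The abstract gives no algorithmic detail for the Viterbi beam search named in the title, so the following is only a generic, minimal sketch of the technique and not the authors' implementation. It assumes a small network of states with discrete observation symbols (rather than SCHMM output densities), log-domain scores, and a fixed beam width. All names (viterbi_beam, s1, s2, the symbols x, y, z, and the probabilities) are hypothetical and chosen for illustration.

```python
import math

NEG_INF = float("-inf")


def viterbi_beam(obs, log_start, log_trans, log_emit, beam=5.0):
    """Viterbi search with beam pruning over a small state network.

    obs       : list of observation symbols
    log_start : {state: log P(state at t=0)}
    log_trans : {state: {next_state: log P(next_state | state)}}
    log_emit  : {state: {symbol: log P(symbol | state)}}
    beam      : hypotheses scoring more than `beam` (in log units) below
                the best hypothesis of the current frame are discarded
    Returns (best_path, best_log_score).
    """
    # Active hypotheses: state -> (log score, best path ending in that state)
    active = {}
    for s, lp in log_start.items():
        score = lp + log_emit[s].get(obs[0], NEG_INF)
        if score > NEG_INF:
            active[s] = (score, [s])

    for o in obs[1:]:
        if not active:
            raise ValueError("no surviving hypothesis")
        # Beam pruning: keep only hypotheses close to the current best.
        best = max(score for score, _ in active.values())
        active = {s: h for s, h in active.items() if h[0] >= best - beam}

        # Viterbi recursion: extend survivors along the network arcs and
        # keep the best-scoring predecessor for each target state.
        nxt = {}
        for s, (score, path) in active.items():
            for t, lp in log_trans.get(s, {}).items():
                cand = score + lp + log_emit[t].get(o, NEG_INF)
                if cand > NEG_INF and (t not in nxt or cand > nxt[t][0]):
                    nxt[t] = (cand, path + [t])
        active = nxt

    if not active:
        raise ValueError("no surviving hypothesis")
    best_state = max(active, key=lambda s: active[s][0])
    score, path = active[best_state]
    return path, score


def logs(d):
    """Convert a dict of probabilities to log probabilities."""
    return {k: math.log(v) for k, v in d.items()}


# Hypothetical two-state toy model (not taken from the paper).
log_start = logs({"s1": 0.6, "s2": 0.4})
log_trans = {"s1": logs({"s1": 0.7, "s2": 0.3}),
             "s2": logs({"s1": 0.4, "s2": 0.6})}
log_emit = {"s1": logs({"x": 0.5, "y": 0.4, "z": 0.1}),
            "s2": logs({"x": 0.1, "y": 0.3, "z": 0.6})}

path, score = viterbi_beam(["x", "y", "z"], log_start, log_trans, log_emit)
print(path, math.exp(score))
```

A narrower beam prunes more hypotheses per frame and so does less work, at the risk of discarding the path that would have been best overall. With a sufficiently wide beam the search reduces to the exact Viterbi algorithm.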

Keywords

finite state machines; hidden Markov models; hotel industry; language translation; maximum likelihood estimation; natural languages; parameter estimation; reservation computer systems; search problems; speech recognition; speech synthesis; HMM parameter estimation; Korean continuous speech recognition system; SCHMM; Viterbi algorithm; Viterbi beam search; automatic interpretation; between-word modeling technique; embedded bootstrapping training method; finite state network; forward-backward algorithm; function word boundaries; hotel reservation; hybrid estimation method; language model; machine translation; phone based semi-continuous hidden Markov model; phone model; recognition results; speaker independent experiments; speech synthesis; Cepstral analysis; Dictionaries; Electronic mail; Gold; Hidden Markov models; Linear predictive coding; Spatial databases; Speech recognition; Viterbi algorithm; Vocabulary;

fLanguage

English

Publisher

IEEE

Conference_Titel

Acoustics, Speech, and Signal Processing, 1995. ICASSP-95., 1995 International Conference on

Conference_Location

Detroit, MI

ISSN

1520-6149

Print_ISBN

0-7803-2431-5

Type

conf

DOI

10.1109/ICASSP.1995.479287

Filename

479287