Golden Mandarin (I)-A real-time Mandarin speech dictation machine for Chinese language with very large vocabulary

Author

Lee, Lin-shan ; Tseng, Chiu-Yu ; Gu, Hung-Yan ; Liu, Fu-hua ; Chang, Chen-hao ; Lin, Yueh-hong ; Lee, Yumin ; Tu, Shih-Lung ; Hsieh, Shew-Heng ; Chen, Chian-hung

Author_Institution

Dept. of Comput. Sci. & Inf. Eng., Nat. Taiwan Univ., Taipei, Taiwan

Volume

1

Issue

2

fYear

1993

fDate

4/1/1993

Firstpage

158

Lastpage

179

Abstract

The first successfully implemented real-time Mandarin dictation machine is described. It recognizes Mandarin speech with a very large vocabulary and almost unlimited text, for the input of Chinese characters into computers. The machine is speaker-dependent, and the input speech takes the form of sequences of isolated syllables. The machine can be decomposed into two subsystems. The first subsystem recognizes the syllables using hidden Markov models. Because every syllable can represent many different homonym characters and can form different multisyllabic words with the syllables to its right or left, a second subsystem is needed to identify the exact characters from the syllables and to correct errors in syllable recognition. The real-time implementation runs on an IBM PC/AT connected to three sets of specially designed hardware boards, on which seven TMS 320C25 chips operate in parallel. Preliminary test results indicate that it takes only about 0.45 s to dictate a syllable (or character), with an accuracy on the order of 90%.
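The role of the second subsystem, choosing characters for a sequence of recognized syllables, can be illustrated with a minimal sketch. The abstract does not say which language model or search method the machine uses, so everything below is an assumption made for illustration: a character bigram model, a Viterbi search, the function name `decode_characters`, and the toy probabilities. The sketch also covers only homonym disambiguation, not the correction of syllable recognition errors.

```python
import math

# Hypothetical toy data (not from the paper): candidate characters per syllable
# and made-up unigram/bigram probabilities.
CANDIDATES = {
    "zhong": ["中", "种", "重"],
    "wen": ["文", "问", "闻"],
}
UNIGRAM = {"中": 0.5, "种": 0.2, "重": 0.3}
BIGRAM = {
    ("中", "文"): 0.6,
    ("中", "问"): 0.05,
    ("种", "问"): 0.02,
    ("重", "闻"): 0.01,
}


def decode_characters(syllables, candidates, unigram, bigram, floor=1e-6):
    """Return the most likely character string for a syllable sequence.

    Viterbi search over the homonym candidates of each syllable, scored with
    a character bigram model. Unseen events get the probability `floor`.
    """
    # best[c] = (log score of the best path ending in character c, that path)
    best = {
        c: (math.log(unigram.get(c, floor)), [c])
        for c in candidates[syllables[0]]
    }
    for syllable in syllables[1:]:
        extended = {}
        for c in candidates[syllable]:
            # Pick the best predecessor for this candidate character.
            score, path = max(
                (prev_score + math.log(bigram.get((p, c), floor)), prev_path)
                for p, (prev_score, prev_path) in best.items()
            )
            extended[c] = (score, path + [c])
        best = extended
    return "".join(max(best.values())[1])


print(decode_characters(["zhong", "wen"], CANDIDATES, UNIGRAM, BIGRAM))  # 中文
```

With these toy numbers the context of the neighbouring syllable selects 中文 over the other eight candidate pairs, which is the kind of ambiguity the abstract attributes to homonym characters.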

Keywords

dictation; hidden Markov models; real-time systems; speech recognition equipment; Chinese language; Golden Mandarin (I); HMM; IBM PC/AT; Mandarin speech dictation machine; TMS 320C25 chips; homonym characters; multisyllabic words; real-time implementation; sequences of isolated syllables; speaker-dependent; speech recognition; syllable recognition; very large vocabulary; voice input; Character recognition; Computer science; Error correction; Hardware; Lattices; Natural languages; Text recognition; Vocabulary

fLanguage

English

Journal_Title

Speech and Audio Processing, IEEE Transactions on

Publisher

IEEE

ISSN

1063-6676

Type

jour

DOI

10.1109/89.222876

Filename

222876