Use of prosodic information for Mandarin word post-recognition

Author

Wu, Chung-Hsien ; Chen, Yeou-Jiunn

Author_Institution

Dept. of Comput. Sci. & Inf. Eng., Nat. Cheng Kung Univ., Tainan, Taiwan

Volume

1

fYear

1997

fDate

4-4 Dec. 1997

Firstpage

253

Abstract

A two-stage recognition scheme, phonetic recognition followed by prosodic recognition, is established. In the phonetic recognition process, 21 initial and 37 final context-independent HMMs are used to construct the phonetic recognizer. In the prosodic recognizer, 175 context-dependent prosodic HMMs are used to model the complicated tone behavior for all possible tone concatenations. Five anti-prosodic HMMs, each corresponding to one lexical tone, are constructed to enhance the discrimination among prosodic HMMs. The system was evaluated in speaker-dependent mode on a vocabulary of thirty thousand words. The experimental results show that using the prosodic information improved the recognition rate from 80.3% to 86.7%.

Keywords

feature extraction; hidden Markov models; natural languages; speech recognition; Mandarin word post-recognition; context-dependent prosodic HMM; experimental results; phonetic recognition; prosodic information; prosodic recognition; recognition rate; speaker-dependent system; tone concatenations; two-stage recognition; vocabulary size; Computer science; Context modeling; Data mining; Databases; Feature extraction; Hidden Markov models; Mel frequency cepstral coefficient; Speech recognition; Viterbi algorithm; Vocabulary

fLanguage

English

Publisher

IEEE

Conference_Titel

TENCON '97. IEEE Region 10 Annual Conference. Speech and Image Technologies for Computing and Telecommunications, Proceedings of IEEE

Conference_Location

Brisbane, Qld., Australia

Print_ISBN

0-7803-4365-4

Type

conf

DOI

10.1109/TENCON.1997.647305

Filename

647305