Title :
Maximum entropy based tone modeling for Mandarin speech recognition
Author :
Wang, Xinhao ; Yu, Yansuo ; Wu, Xihong ; Chi, Huisheng
Author_Institution :
Key Lab. of Machine Perception (Minist. of Educ.), Peking Univ., Beijing, China
Abstract :
To explore the potential of prosody for Mandarin speech recognition, this paper addresses tone modeling and the integration of tone models into the recognizer. The maximum entropy approach is adopted to capture both acoustic and lexical characteristics of tones, because it handles multiple interacting features flexibly. In addition to a tone model, a phoneme-dependent model is constructed to account for the influence of the phoneme. Both models are integrated into the recognizer under the one-pass decoding framework, where they are used to prune the active word-final states during beam search. Experiments on the HUB-4 evaluation material show that the models are effective: they significantly improve recognition performance, with relative character error rate reductions of 7.6% and 11.1%.
Keywords :
entropy; natural language processing; speech recognition; HUB-4 evaluation material; Mandarin speech recognition; maximum entropy based tone modeling; phoneme dependent model; phoneme factor; Auditory system; Computer science education; Decoding; Hidden Markov models; Laboratories; Lattices; Natural languages; Pattern recognition; Maximum Entropy; One-pass decoding; Tone modeling
Conference_Title :
Acoustics Speech and Signal Processing (ICASSP), 2010 IEEE International Conference on
Conference_Location :
Dallas, TX
Print_ISBN :
978-1-4244-4295-9
Electronic_ISBN :
1520-6149
DOI :
10.1109/ICASSP.2010.5495129