DocumentCode :
3530182
Title :
Main vowel domain tone modeling with lexical and prosodic analysis for Mandarin ASR
Author :
Zhang, Shilei ; Shi, Qin ; Chu, Stephen M. ; Qin, Yong
Author_Institution :
IBM China Res. Lab., Beijing
fYear :
2009
fDate :
19-24 April 2009
Firstpage :
4561
Lastpage :
4564
Abstract :
Tone is a distinctive, discriminative feature in Mandarin Chinese. Most large-scale Mandarin speech recognition systems treat tone modeling in a way that is functional but seldom thorough. In particular, many lack the sophistication needed to deal with the myriad variations arising from the combination of acoustic and lexical contexts. This paper reports an attempt to account for these variabilities and to bring richer tone modeling into the IBM Mandarin broadcast transcription system. Specifically, we describe a system that combines the embedded approach with a novel explicit tone modeling technique characterized by (a) robust tone tracking in the main-vowel domain and (b) context-dependent models with lexical and prosodic contexts. The proposed method is validated on a connected-digits set and subsequently evaluated on a large-vocabulary broadcast transcription task, achieving relative reductions in character error rate of 14.8% and 5.4%, respectively.
Keywords :
natural language processing; speech recognition; vocabulary; IBM Mandarin broadcast transcription system; Mandarin ASR; Mandarin Chinese; Mandarin speech recognition system; acoustic context; character error rate; connected-digits set; context-dependent model; large-vocabulary broadcast transcription; lexical analysis; lexical context; main vowel domain tone modeling; prosodic analysis; robust tone tracking; Automatic speech recognition; Broadcasting; Context modeling; Decision trees; Lattices; Natural languages; Parameter estimation; Robustness; Speech recognition; Speech synthesis; decision tree; lattice rescoring; main vowel; tone domain; tone models;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Acoustics, Speech and Signal Processing, 2009. ICASSP 2009. IEEE International Conference on
Conference_Location :
Taipei
ISSN :
1520-6149
Print_ISBN :
978-1-4244-2353-8
Electronic_ISBN :
1520-6149
Type :
conf
DOI :
10.1109/ICASSP.2009.4960645
Filename :
4960645