DocumentCode :
1909595
Title :
A Comparative Study of Diverse Knowledge Sources and Smoothing Techniques via Maximum Entropy for Polyphone Disambiguation in Mandarin TTS Systems
Author :
Mao, Xinnian ; Dong, Yuan ; Han, Jinyu ; Wang, Haila
Author_Institution :
France Telecom R&D Center, Beijing
fYear :
2007
fDate :
Aug. 30 2007-Sept. 1 2007
Firstpage :
162
Lastpage :
169
Abstract :
This paper comparatively evaluated various knowledge sources and smoothing algorithms for pronunciation disambiguation in Mandarin TTS (text-to-speech) systems under the maximum entropy (maxent) framework. In particular, five kinds of knowledge sources, namely characters and their pronunciations, words, their pronunciations and part-of-speech, together with two smoothing algorithms, i.e., Gaussian prior and inequality, were compared. In our experiments conducted on 107 key Chinese polyphones, we found that all the knowledge sources perform almost equally well given the same smoothing measure, but the character-based features compare favorably because they are language independent and can be obtained at the lowest computational cost. Compared with the widely used Gaussian smoothing, the inequality smoothing greatly reduces the number of active features and yields slightly improved accuracy on each knowledge source. Our best result (96.36%) is achieved by using character-based features together with the inequality smoothing, significantly superior to the 81.22% obtained by selecting the most frequent pronunciations and the 88.72% obtained by dictionary look-up with the part-of-speech. We also compared the maxent classifier with the transform-based error-driven learning algorithm (E. Brill, 1995) using the same knowledge sources; the results show that the maxent classifier achieves better performance on polyphone disambiguation.
Keywords :
Gaussian processes; maximum entropy methods; smoothing methods; speech processing; Chinese polyphones; Gaussian prior algorithm; Mandarin TTS systems; character-based features; diverse knowledge sources; inequality algorithm; maxent classifier; maxent framework; maximum entropy; polyphone disambiguation; pronunciation disambiguation; smoothing algorithms; smoothing techniques; text-to-speech system; transform-based error-driven learning algorithm; Entropy; Impedance matching; Information analysis; Natural languages; Research and development; Smoothing methods; Speech synthesis; Telecommunications; Text analysis; Vocabulary;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Natural Language Processing and Knowledge Engineering, 2007. NLP-KE 2007. International Conference on
Conference_Location :
Beijing
Print_ISBN :
978-1-4244-1611-0
Electronic_ISBN :
978-1-4244-1611-0
Type :
conf
DOI :
10.1109/NLPKE.2007.4368028
Filename :
4368028