DocumentCode
442057
Title
Using different models to label the break indices for Mandarin speech synthesis
Author
Shao, Yan-Qiu ; Zhao, Yong-Zhen ; Han, Ji-Qing ; Liu, Ting
Author_Institution
Sch. of Comput. Sci. & Technol., Harbin Inst. of Technol., China
Volume
6
fYear
2005
fDate
18-21 Aug. 2005
Firstpage
3802
Abstract
A high-quality speech synthesis system requires effective prediction of break indices. This paper adopts a large-scale corpus annotated with five-tier break indices according to C-TOBI. Based on this corpus, three models, an N-gram model, an artificial neural network, and a Markov model (MM), are employed to automatically label the break indices of unrestricted Mandarin text. The approaches differ not only in the models but also in the features they use. The results show that, among the three models, the MM gives the best result, with an accuracy of 77.0% and an average error cost of 0.155. The three models are compared with one another, and conclusions are drawn about the problem.
Keywords
Markov processes; neural nets; speech synthesis; text analysis; Markov model; N-gram; artificial neural network; break indices; Mandarin text; speech synthesis system; Artificial neural networks; Computer science; Costs; Large-scale systems; Network synthesis; Predictive models; Speech analysis; Speech synthesis; Stochastic processes; Synthesizers; Artificial Neural Network; Break Indices; Markov Model; N-gram; Speech Synthesis
fLanguage
English
Publisher
IEEE
Conference_Titel
Machine Learning and Cybernetics, 2005. Proceedings of 2005 International Conference on
Conference_Location
Guangzhou, China
Print_ISBN
0-7803-9091-1
Type
conf
DOI
10.1109/ICMLC.2005.1527602
Filename
1527602