Using different models to label the break indices for Mandarin speech synthesis

Author

Shao, Yan-Qiu ; Zhao, Yong-Zhen ; Han, Ji-Qing ; Liu, Ting

Author_Institution

Sch. of Comput. Sci. & Technol., Harbin Inst. of Technol., China

Volume

6

fYear

2005

fDate

18-21 Aug. 2005

Firstpage

3802

Abstract

A high-quality speech synthesis system requires effective prediction of break indices. This paper adopts a large-scale corpus annotated with five-tier break indices according to C-TOBI. Based on this corpus, several models, including an N-gram model, an artificial neural network, and a Markov model (MM), are employed to automatically label the break indices for unrestricted Mandarin text. The approaches differ not only in the models but also in the features they use. The results show that, among these three models, the MM gives the best result: its accuracy reaches 77.0% and its average error cost is 0.155. The three models are compared with each other, and conclusions are drawn that give further insight into the problem.

Keywords

Markov processes; neural nets; speech synthesis; text analysis; Markov model; N-gram; artificial neural network; break indices; mandarin text; speech synthesis system; Artificial neural networks; Computer science; Costs; Large-scale systems; Network synthesis; Predictive models; Speech analysis; Speech synthesis; Stochastic processes; Synthesizers; Artificial Neural Network; Break Indices; Markov Model; N-gram; Speech Synthesis;

fLanguage

English

Publisher

IEEE

Conference_Titel

Machine Learning and Cybernetics, 2005. Proceedings of 2005 International Conference on

Conference_Location

Guangzhou, China

Print_ISBN

0-7803-9091-1

Type

conf

DOI

10.1109/ICMLC.2005.1527602

Filename

1527602