Incorporation of phrase intonation to context clustering for average voice models in HMM-based Thai speech synthesis

Author

Chomphan, Suphattharachai ; Kobayashi, Takao

Author_Institution

Interdiscipl. Grad. Sch. of Sci. & Eng., Tokyo Inst. of Technol., Yokohama

fYear

2008

fDate

March 31, 2008 - April 4, 2008

Firstpage

4637

Lastpage

4640

Abstract

This paper describes a novel approach to context clustering in speaker-independent HMM-based Thai speech synthesis, aimed at improving the tone intelligibility of both the average voice and the speaker-adapted voice. Two phrase intonation features from a generative model, the baseline value of the fundamental frequency and the phrase command amplitude, are extracted and then exploited in the context clustering process of the HMM training stage. In the experiments, subjective evaluations of tone intelligibility are conducted for both the average voice and the adapted voice. The results show that the tone correctness of the synthesized speech is significantly improved.

Keywords

hidden Markov models; speech synthesis; average voice models; context clustering; phrase intonation; speaker independent HMM-based Thai speech synthesis; Context modeling; Databases; Decision trees; Frequency; Loudspeakers; Natural languages; Statistical distributions; Training data; Thai tone; average voice

fLanguage

English

Publisher

IEEE

Conference_Titel

Acoustics, Speech and Signal Processing, 2008. ICASSP 2008. IEEE International Conference on

Conference_Location

Las Vegas, NV

ISSN

1520-6149

Print_ISBN

978-1-4244-1483-3

Electronic_ISBN

1520-6149

Type

conf

DOI

10.1109/ICASSP.2008.4518690

Filename

4518690