Regional Information Center for Science and Technology - Joint modelling of voicing label and continuous F0 for HMM based speech synthesis

DocumentCode :

2174416

Title :

Joint modelling of voicing label and continuous F0 for HMM based speech synthesis

Author :

Yu, K. ; Young, S.

Author_Institution :

Eng. Dept., Cambridge Univ., Cambridge, UK

fYear :

2011

fDate :

22-27 May 2011

Firstpage :

4572

Lastpage :

4575

Abstract :

Fundamental frequency, or F0, is critical for high quality speech synthesis in HMM based speech synthesis. Traditionally, F0 values are considered to depend on a binary voicing decision such that they are continuous in voiced regions and undefined in unvoiced regions. Multi-space distribution HMM (MSDHMM) has been used for modelling the discontinuous F0. Recently, a continuous F0 modelling framework has been proposed and shown to be effective, where continuous F0 observations are assumed to always exist and voicing labels are explicitly modelled by an independent stream. In this paper, a refined continuous F0 modelling approach is proposed. Here, F0 values are assumed to be dependent on voicing labels and both are jointly modelled in a single stream. Due to the enforced dependency, the new method can effectively reduce the voicing classification error. Subjective listening tests also demonstrate that the new approach can yield significant improvements in the naturalness of the synthesised speech. A dynamic random unvoiced F0 generation method is also investigated. Experiments show that it has a significant effect on the quality of synthesised speech.
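
The idea of modelling F0 and the voicing label jointly in one stream can be illustrated with a minimal sketch. It is not taken from the paper: it assumes, purely for illustration, that each HMM state holds one Gaussian over log-F0 per voicing label together with a label prior, so that p(f0, label | state) = P(label | state) * N(f0; mean, var). The names `VoicingGaussian`, `joint_log_likelihood` and `voiced_posterior`, and all numeric values, are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass
class VoicingGaussian:
    """Hypothetical per-label component of one state: prior plus a 1-D Gaussian."""
    prior: float  # assumed P(label | state)
    mean: float   # mean of log-F0 under this label
    var: float    # variance of log-F0 under this label


def log_gauss(x: float, mean: float, var: float) -> float:
    """Log density of a univariate Gaussian."""
    return -0.5 * (math.log(2.0 * math.pi * var) + (x - mean) ** 2 / var)


def joint_log_likelihood(f0: float, label: str, state: dict) -> float:
    """log p(f0, label | state) under the assumed factorisation."""
    comp = state[label]
    return math.log(comp.prior) + log_gauss(f0, comp.mean, comp.var)


def voiced_posterior(f0: float, state: dict) -> float:
    """P(voiced | f0, state), obtained by normalising the two joint terms."""
    lv = joint_log_likelihood(f0, "V", state)
    lu = joint_log_likelihood(f0, "U", state)
    m = max(lv, lu)  # subtract the maximum for numerical stability
    ev, eu = math.exp(lv - m), math.exp(lu - m)
    return ev / (ev + eu)


# Illustrative (made-up) state: a tight voiced Gaussian, a broad unvoiced one.
state = {
    "V": VoicingGaussian(prior=0.8, mean=5.0, var=0.04),
    "U": VoicingGaussian(prior=0.2, mean=4.5, var=1.0),
}
```

Under these assumed parameters, an observation near the voiced mean gives a voiced posterior above 0.5, while one far from it falls below 0.5, which shows how a dependency between F0 and the voicing label can inform the voicing decision.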

Keywords :

hidden Markov models; speech synthesis; HMM based speech synthesis; MSDHMM; binary voicing decision; continuous F0 modelling; dynamic random unvoiced F0 generation method; high quality speech synthesis; joint modelling; multispace distribution HMM; voicing label; Correlation; Joints; Mathematical model; Speech; Training; voicing classification

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Acoustics, Speech and Signal Processing (ICASSP), 2011 IEEE International Conference on

Conference_Location :

Prague

ISSN :

1520-6149

Print_ISBN :

978-1-4577-0538-0

Electronic_ISBN :

1520-6149

Type :

conf

DOI :

10.1109/ICASSP.2011.5947372

Filename :

5947372

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=2174416