Duration modeling in a restricted-domain female-voice synthesis in Spanish using neural networks

Author

Córdoba, R. ; Montero, J.M. ; Gutierrez-Arriola, J. ; Pardo, J.M.

Author_Institution

Dept. de Ingenieria Electron., Univ. Politecnica de Madrid, Spain

Volume

2

fYear

2001

fDate

2001

Firstpage

793

Abstract

The objective of this paper is the accurate prediction of segmental duration in a Spanish text-to-speech system. Many parameters affect duration, but not all of them are always relevant. We present a complete environment for deciding which parameters are most relevant and how best to code them. This work continues Cordoba et al. (1999), where all efforts were dedicated to an unrestricted-domain database for a male voice. Here we consider a female voice in a restricted-domain environment. The restricted domain offers several advantages for the modeling: the variation across the different patterns is reduced, so most of the decisions we have made about the parameters are now based on more significant results. As a result, the conclusions we present show clearly which parameters are best. The system is based on a fully configurable neural network.

Keywords

neural nets; speech synthesis; Spanish text-to-speech system; duration modeling; neural networks; restricted-domain environment; restricted-domain female voice synthesis; segmental duration; Databases; Decision trees; Elasticity; Intelligent networks; Network synthesis; Neural networks; Predictive models; Speech synthesis; Stress; Telecommunications

fLanguage

English

Publisher

IEEE

Conference_Titel

Acoustics, Speech, and Signal Processing, 2001. Proceedings. (ICASSP '01). 2001 IEEE International Conference on

Conference_Location

Salt Lake City, UT

ISSN

1520-6149

Print_ISBN

0-7803-7041-4

Type

conf

DOI

10.1109/ICASSP.2001.941034

Filename

941034