Nonlinear dynamic modeling of the voiced excitation for improved speech synthesis

Author

Narasimhan, Karthik ; Principe, Jose C. ; Childers, Donald G.

Author_Institution

Comput. Neuroeng. Lab., Florida Univ., Gainesville, FL, USA

Volume

1

fYear

1999

fDate

15-19 Mar 1999

Firstpage

389

Abstract

This paper describes the implementation of a waveform-based global dynamic model intended to capture the variability of the vocal folds. The residue extracted from speech by inverse filtering is pre-processed to remove phoneme dependence and is used as the input time series to the dynamic model. After training, the dynamic model is seeded with a point from the trajectory of the time series and iterated to produce the synthetic excitation waveform. The output of the dynamic model is compared with the input time series, and these comparisons confirm that the model has captured the variability in the residue. The output of the dynamic model is then used to synthesize speech with a pitch-synchronous speech synthesizer, and the result is observed to be close to natural speech.
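The train, seed, and iterate loop described in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the two-sinusoid signal stands in for the pre-processed residue, the embedding dimension of 4 is arbitrary, and the model (a fixed random tanh layer with a least-squares readout, named `RandomTanhModel` here) is a simple stand-in for whatever global nonlinear model the authors trained.

```python
import numpy as np


def delay_embed(x, dim):
    """Build delay vectors: each state holds `dim` consecutive samples,
    and the matching target is the sample that follows them."""
    states = np.stack([x[i:len(x) - dim + i] for i in range(dim)], axis=1)
    targets = x[dim:]
    return states, targets


class RandomTanhModel:
    """Hypothetical global nonlinear predictor: random fixed tanh features
    plus a bias, with a linear readout fitted by least squares."""

    def __init__(self, dim, hidden=40, seed=0):
        rng = np.random.default_rng(seed)
        self.w_in = rng.normal(size=(dim, hidden))
        self.b_in = rng.normal(size=hidden)
        self.w_out = np.zeros(hidden + 1)

    def _features(self, states):
        h = np.tanh(states @ self.w_in + self.b_in)
        return np.hstack([h, np.ones((len(h), 1))])  # append bias column

    def fit(self, states, targets):
        feats = self._features(states)
        self.w_out, *_ = np.linalg.lstsq(feats, targets, rcond=None)

    def predict(self, states):
        return self._features(states) @ self.w_out


def free_run(model, seed_state, n_steps):
    """Seed the model with one delay vector, then feed each prediction
    back in as input to generate a waveform autonomously."""
    state = np.array(seed_state, dtype=float)
    out = list(state)
    for _ in range(n_steps):
        nxt = model.predict(state[None, :])[0]
        out.append(nxt)
        state = np.append(state[1:], nxt)  # slide the window forward
    return np.array(out)


# Synthetic stand-in for the residue time series (assumption, not real data).
n = np.arange(600)
residue = np.sin(2 * np.pi * n / 50) + 0.3 * np.sin(2 * np.pi * n / 17)

dim = 4  # embedding dimension chosen arbitrarily for the sketch
states, targets = delay_embed(residue, dim)

model = RandomTanhModel(dim)
model.fit(states, targets)
train_mse = float(np.mean((model.predict(states) - targets) ** 2))

# Seed with a point from the training trajectory and iterate.
synthetic = free_run(model, states[0], 200)
```

Comparing `synthetic` with `residue` over the same span mirrors, in a very reduced form, the comparison between model output and input time series that the abstract mentions. A model that fits one step ahead is not guaranteed to stay close to the original trajectory when iterated, which is why that comparison matters.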

Keywords

filtering theory; inverse problems; speech intelligibility; speech synthesis; time series; waveform analysis; input time series; inverse filtering; natural speech; nonlinear dynamic modeling; pitch-synchronous speech synthesizer; speech synthesis; synthetic excitation waveform; time series trajectory; training; vocal folds variability; voiced excitation; waveform-based global dynamic model; Filtering; Laboratories; Mathematical model; Natural languages; Neural engineering; Nonlinear dynamical systems; Oscillators; Predictive models; Speech synthesis; Synthesizers

fLanguage

English

Publisher

IEEE

Conference_Titel

Acoustics, Speech, and Signal Processing, 1999. Proceedings., 1999 IEEE International Conference on

Conference_Location

Phoenix, AZ

ISSN

1520-6149

Print_ISBN

0-7803-5041-3

Type

conf

DOI

10.1109/ICASSP.1999.758144

Filename

758144