DocumentCode :
3169268
Title :
Towards a high quality Finnish talking head
Author :
Olivés, Jean-Luc ; Sams, Mikko ; Kulju, Janne ; Seppälä, Otto ; Karjalainen, Matti ; Altosaar, Toomas ; Lemmetty, Sami ; Töyrä, Kristian ; Vainio, Martti
Author_Institution :
Lab. of Comput. Eng., Helsinki Univ. of Technol., Espoo, Finland
fYear :
1999
fDate :
1999
Firstpage :
433
Lastpage :
437
Abstract :
We describe how our Finnish talking head was improved by using a new auditory speech synthesis method based on neural networks and optimal synchronization of the facial speech animation and the audio signal. In our first version of the talking head, the user typed in text, and synthesized auditory speech and synchronized facial animation were created automatically. We combine a 3D facial model with a commercial auditory text-to-speech synthesizer (TTS). The auditory speech is produced by concatenating pre-recorded samples of natural speech according to a set of rules. The quality of the current speech synthesis is not yet adequate. A new strategy has been developed to improve the TTS and to integrate auditory synthesizer synchronization, especially when hardware capabilities are limited. We are developing a new method to achieve an optimal synchronization, independent of the platform used. This method is based on predictive visual synthesis. The new synchronization method gives us better control over audio-visual speech synthesis in the time domain. Using the diphone duration, we can use a more realistic interpolation function between the visemes. Thus, we can also take into account coarticulation effects.
Keywords :
computer animation; multimedia computing; natural languages; speech synthesis; synchronisation; 3D facial model; Finnish talking head; audio signal; audio-visual speech synthesis; auditory speech synthesis method; auditory text-to-speech synthetizer; coarticulation effects; diphone duration; facial speech animation; natural speech; neural networks; optimal synchronization; predictive visual synthesis; synchronized facial animation; synthesized auditory speech; talking heads; Facial animation; Filters; Frequency synchronization; Laboratories; Network synthesis; Neural networks; Signal synthesis; Speech synthesis; Synthesizers; Table lookup;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Multimedia Signal Processing, 1999 IEEE 3rd Workshop on
Conference_Location :
Copenhagen
Print_ISBN :
0-7803-5610-1
Type :
conf
DOI :
10.1109/MMSP.1999.793886
Filename :
793886