Parameterized visual speech synthesis and its evaluation

Author

Mottonen, Riikka ; Olives, Jean-Luc ; Kulju, Janne ; Sams, Mikko

Author_Institution

Helsinki University of Technology, Laboratory of Computational Engineering, P.O. Box 9400, FIN-02015 HUT, Finland

fYear

2000

fDate

4-8 Sept. 2000

Firstpage

Lastpage

Abstract

We have constructed an audio-visual text-to-speech synthesizer for Finnish by combining a dynamic facial model with an acoustic speech synthesizer. The visual speech is based on a letter-to-viseme mapping, and the animation is created by linear interpolation between the visemes. Each viseme is defined by 12 parameter values. In a recent study we showed that visual speech increases the intelligibility of both natural and synthetic auditory speech [5]. We have since upgraded our visual speech synthesis by adding a tongue model and by improving the speech parameters on the basis of that intelligibility study. Here we present data from a new intelligibility study that demonstrate the improved performance of the synthesizer. Presenting the visual speech in three-dimensional space did not further improve intelligibility.
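
The letter-to-viseme mapping and linear interpolation described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The viseme table, its parameter values, the function names, and the number of frames per transition are all hypothetical placeholders; only the count of 12 parameters per viseme and the use of linear interpolation come from the abstract.

```python
from typing import Dict, List

N_PARAMS = 12  # from the abstract: a viseme is defined by 12 parameter values

# Hypothetical viseme table. The values are placeholders, not the authors' data.
VISEMES: Dict[str, List[float]] = {
    "a": [1.0] * N_PARAMS,
    "m": [0.0] * N_PARAMS,
    "o": [0.5] * N_PARAMS,
}


def letters_to_visemes(text: str) -> List[List[float]]:
    """Map each known letter to its viseme vector; unknown letters are skipped."""
    return [VISEMES[ch] for ch in text.lower() if ch in VISEMES]


def interpolate(start: List[float], end: List[float], t: float) -> List[float]:
    """Linearly interpolate between two viseme vectors for t in [0, 1]."""
    return [s + (e - s) * t for s, e in zip(start, end)]


def animate(text: str, frames_per_transition: int = 4) -> List[List[float]]:
    """Return one parameter vector per animation frame for the given text."""
    targets = letters_to_visemes(text)
    if not targets:
        return []
    frames = [list(targets[0])]  # start on the first viseme
    for start, end in zip(targets, targets[1:]):
        for step in range(1, frames_per_transition + 1):
            frames.append(interpolate(start, end, step / frames_per_transition))
    return frames
```

Under these assumptions, `animate("ma")` starts on the placeholder viseme for "m" and reaches the one for "a" after four evenly spaced frames. A real synthesizer would also need per-phoneme durations synchronized with the acoustic output, which this sketch leaves out.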

Keywords

Computational modeling; Joints; Natural languages; Speech; Synthesizers; Tongue; Visualization

fLanguage

English

Publisher

IEEE

Conference_Titel

Signal Processing Conference, 2000 10th European

Conference_Location

Tampere, Finland

Print_ISBN

978-952-1504-43-3

Type

conf

Filename

7075406

Link To Document

https://search.isc.ac/dl/search/defaultta.aspx?DTC=49&DC=696785