Sample-based synthesis of talking heads

Author

Graf, Hans Peter ; Cosatto, Eric

Author_Institution

AT&T Labs., Middletown, NJ, USA

fYear

2001

fDate

2001

Firstpage

3

Lastpage

7

Abstract

Synthesizing photo-realistic talking heads is a challenging problem, and so far all attempts using conventional computer graphics have produced heads with a distinctly synthetic look. To look credible, a head must show a picture-perfect appearance, natural head movements, and good lip-sound synchronization. We use sample-based graphics to achieve more photo-realistic appearances than is possible with the traditional approach of 3D modeling and texture mapping. For sample-based graphics, parts of faces are first cut from recorded images and stored in a database. New sequences are then synthesized by integrating such parts into whole faces. With sufficient recorded data, this approach produces by far the most natural-looking speech articulation. We integrate 3D modeling with the sample-based technique in order to enhance its flexibility. This allows, for example, showing the head over a much wider range of orientations.

Keywords

computer animation; image sequences; solid modelling; synchronisation; 3D modeling; computer graphics; head movements; lip-sound synchronization; photo-realistic appearances; picture-perfect appearance; sample-based graphics; talking heads; Computer graphics; Computer interfaces; Customer service; Facial animation; Image databases; Lips; Magnetic heads; Shape; Speech synthesis; Videos

fLanguage

English

Publisher

IEEE

Conference_Titel

Recognition, Analysis, and Tracking of Faces and Gestures in Real-Time Systems, 2001. Proceedings. IEEE ICCV Workshop on

Conference_Location

Vancouver, BC

ISSN

1530-1044

Print_ISBN

0-7695-1074-4

Type

conf

DOI

10.1109/RATFG.2001.938903

Filename

938903