Face analysis for the synthesis of photo-realistic talking heads

Author

Graf, Hans Peter ; Cosatto, Eric ; Ezzat, Tony

Author_Institution

AT&T Labs-Res., Red Bank, NJ, USA

fYear

2000

fDate

2000

Firstpage

189

Lastpage

194

Abstract

This paper describes techniques for extracting bitmaps of facial parts from videos of a talking person. The goal is to synthesize photo-realistic talking heads of high quality that show picture-perfect appearance and realistic head movements with good lip-sound synchronization. For the synthesis of a talking head, bitmaps of facial parts are combined to form whole heads and then sequences of such images are integrated with audio from a text-to-speech synthesizer. For a seamless integration of facial parts into an animation, their shape and visual appearance must be known with high accuracy. The recognition system has to find not only the locations of facial features, but must also be able to determine the head's orientation and recognize the facial expressions. Our face recognition proceeds in multiple steps, each with an increased precision. Using motion, color and shape information, the head's position and the location of the main facial features are determined first. Then smaller areas are searched with matched filters, in order to identify specific facial features with high precision. From this information a head's 3D orientation is calculated. Facial parts are cut from the image and, using the head's orientation, are warped into bitmaps with 'normalized' orientation and scale.

Keywords

computer animation; face recognition; feature extraction; filtering theory; image colour analysis; image sequences; matched filters; motion estimation; realistic images; search problems; speech synthesis; synchronisation; animation; audio integration; bitmap extraction; color information; face analysis; face recognition; facial feature location; head orientation; image sequences; lip-sound synchronization; matched filters; motion information; photo-realistic talking heads; picture-perfect appearance; realistic head movements; recognition system; searching; shape information; talking head synthesis; text-to-speech synthesizer; videos; Bridges; Electrical capacitance tomography; Face recognition; Facial animation; Identity-based encryption; Magnetic heads; Read only memory; Shape measurement; Speech synthesis; Videos;

fLanguage

English

Publisher

IEEE

Conference_Titel

Automatic Face and Gesture Recognition, 2000. Proceedings. Fourth IEEE International Conference on

Conference_Location

Grenoble

Print_ISBN

0-7695-0580-5

Type

conf

DOI

10.1109/AFGR.2000.840633

Filename

840633