Rendering a personalized photo-real talking head from short video footage

Author

Wang, Lijuan ; Han, Wei ; Qian, Xiaojun ; Soong, Frank K.

Author_Institution

Microsoft Res. Asia, Beijing, China

fYear

2010

fDate

Nov. 29 2010-Dec. 3 2010

Firstpage

129

Lastpage

134

Abstract

In this paper, we propose an HMM trajectory-guided, real image sample concatenation approach to photo-real talking head synthesis. An audio-visual database of a person is first recorded to train a statistical Hidden Markov Model (HMM) of lip movement. The HMM is then used to generate the dynamic trajectory of lip movement for given speech signals in the maximum probability sense. The generated trajectory then serves as a guide for selecting, from the original training database, an optimal sequence of lip images, which are stitched back into a background head video. The whole procedure is fully automatic and data driven. With as little as 20 minutes of recorded audio/video footage, the proposed system can synthesize a highly photo-real talking head in sync with the given speech signals (natural or TTS synthesized). This system won first place in the A/V consistency contest of the LIPS Challenge (2009), as perceptually evaluated by recruited human subjects.
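
The trajectory-guided selection step can be illustrated with a minimal sketch. The abstract does not specify the cost functions or the search procedure, so everything below is an assumption made for illustration: lip images are represented as plain feature vectors, the target cost is the squared distance to the HMM-generated trajectory frame, the concatenation cost is the squared distance between consecutive selected samples, and the optimal sequence is found by a Viterbi-style dynamic program. The names `select_samples`, `sq_dist`, and `concat_weight` are hypothetical and do not come from the paper.

```python
from typing import List, Sequence

Vector = Sequence[float]


def sq_dist(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def select_samples(
    trajectory: List[Vector],
    samples: List[Vector],
    concat_weight: float = 1.0,
) -> List[int]:
    """Pick one sample index per trajectory frame, minimizing total cost.

    Assumed cost (not taken from the paper):
      target cost        = distance between a sample and the guiding frame
      concatenation cost = distance between consecutive selected samples
    """
    if not trajectory or not samples:
        return []

    n = len(samples)
    # cost[j]: lowest cumulative cost of any path ending at sample j
    cost = [sq_dist(trajectory[0], s) for s in samples]
    backpointers: List[List[int]] = []

    for target in trajectory[1:]:
        new_cost = []
        pointers = []
        for j in range(n):
            # Best predecessor for sample j under the assumed concatenation cost.
            best_i = min(
                range(n),
                key=lambda i: cost[i] + concat_weight * sq_dist(samples[i], samples[j]),
            )
            transition = cost[best_i] + concat_weight * sq_dist(samples[best_i], samples[j])
            new_cost.append(transition + sq_dist(target, samples[j]))
            pointers.append(best_i)
        cost = new_cost
        backpointers.append(pointers)

    # Trace the lowest-cost path back from the final frame.
    j = min(range(n), key=lambda k: cost[k])
    path = [j]
    for pointers in reversed(backpointers):
        j = pointers[j]
        path.append(j)
    path.reverse()
    return path
```

With `concat_weight` set to zero the search reduces to picking the nearest sample for each frame independently; larger values favor smoother sequences at the expense of fidelity to the guiding trajectory. The search is quadratic in the number of samples per frame, so a real system would likely prune the candidates first.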

Keywords

hidden Markov models; image sampling; image sequences; realistic images; rendering (computer graphics); speech synthesis; video signal processing; visual databases; audio visual database; image sample concatenation approach; image sequence; lip movement; maximum probability sense; photo real talking head synthesis; real time rendering; speech signals; video footage; Hidden Markov models; Lips; Magnetic heads; Training; Trajectory; Visualization; photo-real; talking head; trajectory-guided; visual speech synthesis;

fLanguage

English

Publisher

IEEE

Conference_Titel

Chinese Spoken Language Processing (ISCSLP), 2010 7th International Symposium on

Conference_Location

Tainan

Print_ISBN

978-1-4244-6244-5

Type

conf

DOI

10.1109/ISCSLP.2010.5684834

Filename

5684834