DocumentCode
2016560
Title
Rendering a personalized photo-real talking head from short video footage
Author
Wang, Lijuan ; Han, Wei ; Qian, Xiaojun ; Soong, Frank K.
Author_Institution
Microsoft Res. Asia, Beijing, China
fYear
2010
fDate
Nov. 29 2010-Dec. 3 2010
Firstpage
129
Lastpage
134
Abstract
In this paper, we propose an HMM trajectory-guided, real image sample concatenation approach to photo-real talking head synthesis. An audio-visual database of a person is first recorded to train a statistical hidden Markov model (HMM) of lip movement. The HMM is then used to generate the dynamic trajectory of lip movement for given speech signals in the maximum probability sense. The generated trajectory then guides the selection, from the original training database, of an optimal sequence of lip images, which are stitched back into a background head video. The whole procedure is fully automatic and data-driven. With as little as 20 minutes of recorded audio/video footage, the proposed system can synthesize a highly photo-real talking head in sync with the given speech signals (natural or TTS-synthesized). This system won first place in the A/V consistency contest of the LIPS Challenge (2009), as perceptually evaluated by recruited human subjects.
Keywords
hidden Markov models; image sampling; image sequences; realistic images; rendering (computer graphics); speech synthesis; video signal processing; visual databases; audio visual database; image sample concatenation approach; image sequence; lip movement; maximum probability sense; photo real talking head synthesis; real time rendering; speech signals; video footage; Hidden Markov models; Lips; Magnetic heads; Training; Trajectory; Visualization; photo-real; talking head; trajectory-guided; visual speech synthesis
fLanguage
English
Publisher
IEEE
Conference_Titel
Chinese Spoken Language Processing (ISCSLP), 2010 7th International Symposium on
Conference_Location
Tainan
Print_ISBN
978-1-4244-6244-5
Type
conf
DOI
10.1109/ISCSLP.2010.5684834
Filename
5684834