Title :
Learning and synthesizing MPEG-4 compatible 3-D face animation from video sequence
Author :
Gao, Wen ; Chen, Yiqiang ; Wang, Rui ; Shan, Shiguang ; Jiang, Dalong
Author_Institution :
Inst. of Comput. Technol., Chinese Acad. of Sci., Beijing, China
Abstract :
We present a new system that applies an example-based learning method to learn facial motion patterns from a video sequence of individual facial behavior, such as lip motion and facial expressions, and uses those patterns to create vivid three-dimensional (3-D) face animation conforming to the definition of the MPEG-4 face animation parameters. The system consists of three key modules: face tracking, pattern learning, and face animation. In face tracking, to reduce the complexity of the tracking process, a novel coarse-to-fine strategy combined with a Kalman filter is proposed for localizing key facial landmarks in each image of the video. The landmark sequence is normalized into a visual feature matrix and passed to the next stage. In pattern learning, the pretraining stage requires the parameters of the camera that took the video to be supplied with the training video data, so that the system can estimate, with the aid of computer vision methods, the basic mapping from a normalized two-dimensional (2-D) visual feature matrix to its representation in the 3-D MPEG-4 face animation parameter space. In the practice stage, because camera parameters are in most cases not provided with the video data, the system uses machine learning to fill in the incomplete 3-D information that the mapping requires; this information is needed to represent face orientation. The example-based learning in this system integrates several methods, including clustering, hidden Markov models (HMMs), and artificial neural networks (ANNs), to improve both the conversion from the 2-D to the 3-D model and the estimation of the incomplete 3-D information; the resulting mapping is then used to drive the face animation. In face animation, the system can synthesize face animation following any type of face motion in video. Experiments show that our system produces more vivid face motion animation than earlier systems.
Keywords :
Kalman filters; code standards; computer animation; data compression; feature extraction; hidden Markov models; image motion analysis; learning (artificial intelligence); neural nets; pattern clustering; telecommunication standards; video coding; ANN; HMM; Kalman filter; MPEG-4 compatible 3D face animation; MPEG-4 face animation parameters; camera parameters; clustering; complexity reduction; computer vision method; example-based learning method; face animation module; face orientation presentation; face tracking module; facial behavior; facial expressions; facial landmarks; facial motion patterns; feature extraction; lip motion; machine learning technology; normalized 2D visual feature matrix; normalized two-dimensional visual feature matrix; pattern learning module; three-dimensional face animation; training video data; video sequence; Cameras; Computer vision; Face detection; Facial animation; Hidden Markov models; Learning systems; MPEG 4 Standard; Space technology; Two dimensional displays; Video sequences;
Journal_Title :
Circuits and Systems for Video Technology, IEEE Transactions on
DOI :
10.1109/TCSVT.2003.817629