Regional Information Center for Science and Technology - Transferring of Speech Movements from Video to 3D Face Space

DocumentCode :

831357

Title :

Transferring of Speech Movements from Video to 3D Face Space

Author :

Pei, Yuru ; Zha, Hongbin

Author_Institution :

Nat. Lab. on Machine Perception, Peking Univ., Beijing

Volume :

Issue :

fYear :

2007

Firstpage :

Lastpage :

Abstract :

We present a novel method for transferring speech animation recorded in low-quality videos to high-resolution 3D face models. The basic idea is to synthesize the animated faces by interpolating among a small set of 3D key face shapes that span a 3D face space. The 3D key shapes are extracted by an unsupervised learning process in 2D video space, which forms a set of 2D visemes that are then mapped to the 3D face space. The learning process consists of two main phases: 1) Isomap-based nonlinear dimensionality reduction to embed the video speech movements into a low-dimensional manifold and 2) k-means clustering in the low-dimensional space to extract 2D key viseme frames. Our main contribution is the use of the Isomap-based learning method to extract the intrinsic geometry of the speech video space, which makes it possible to define the 3D key viseme shapes. To do so, we need to capture only a limited number of 3D key face models with a general 3D scanner. Moreover, we develop a skull movement recovery method based on simple anatomical structures to enhance 3D realism in local mouth movements. Experimental results show that our method can achieve realistic 3D animation effects with a small number of 3D key face models.
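
The two learning phases and the interpolation step named in the abstract can be illustrated with a minimal, standard-library Python sketch. It is not the authors' implementation: the function names (`isomap`, `key_frames`, `weights_for`, `blend`), the neighbour count, the inverse-distance weighting, and the toy arc-shaped "video features" and two-vertex "meshes" are all assumptions made for illustration, since the record does not specify these details.

```python
import math
import random

def isomap(X, n_neighbors=3, n_components=1, iters=200):
    """Basic Isomap: kNN graph -> geodesic distances -> classical MDS."""
    n = len(X)
    D = [[math.dist(a, b) for b in X] for a in X]
    # Symmetric k-nearest-neighbour graph (assumes the graph is connected).
    G = [[math.inf] * n for _ in range(n)]
    for i in range(n):
        G[i][i] = 0.0
        for j in sorted(range(n), key=lambda j: D[i][j])[1:n_neighbors + 1]:
            G[i][j] = G[j][i] = D[i][j]
    # Floyd-Warshall: geodesic distance = shortest path through the graph.
    for m in range(n):
        for i in range(n):
            for j in range(n):
                G[i][j] = min(G[i][j], G[i][m] + G[m][j])
    # Classical MDS: double-centre the squared distances into a Gram matrix.
    S = [[g * g for g in r] for r in G]
    mean = [sum(r) / n for r in S]
    tot = sum(mean) / n
    B = [[-0.5 * (S[i][j] - mean[i] - mean[j] + tot) for j in range(n)]
         for i in range(n)]
    rng = random.Random(0)
    Y = [[] for _ in range(n)]
    for _ in range(n_components):
        # Power iteration for the dominant eigenvector, then deflate B.
        v = [rng.gauss(0, 1) for _ in range(n)]
        for _ in range(iters):
            w = [sum(B[i][j] * v[j] for j in range(n)) for i in range(n)]
            norm = math.sqrt(sum(x * x for x in w))
            if norm == 0:
                break
            v = [x / norm for x in w]
        lam = sum(v[i] * B[i][j] * v[j] for i in range(n) for j in range(n))
        scale = math.sqrt(max(lam, 0.0))
        for i in range(n):
            Y[i].append(v[i] * scale)
            for j in range(n):
                B[i][j] -= lam * v[i] * v[j]
    return Y

def key_frames(Y, k, iters=50):
    """k-means (Lloyd's algorithm) in the embedding; returns the index of
    the frame nearest to each cluster centre."""
    n = len(Y)
    centres = [Y[(2 * c + 1) * n // (2 * k)] for c in range(k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for y in Y:
            groups[min(range(k), key=lambda c: math.dist(y, centres[c]))].append(y)
        centres = [[sum(col) / len(g) for col in zip(*g)] if g else centres[c]
                   for c, g in enumerate(groups)]
    return [min(range(n), key=lambda i: math.dist(Y[i], ctr)) for ctr in centres]

def weights_for(y, key_coords, eps=1e-9):
    """Assumed weighting: normalised inverse distance to each key frame."""
    inv = [1.0 / (math.dist(y, kc) + eps) for kc in key_coords]
    total = sum(inv)
    return [w / total for w in inv]

def blend(key_shapes, weights):
    """Per-vertex weighted sum of key meshes (lists of (x, y, z) vertices)."""
    return [tuple(sum(w * v[a] for w, v in zip(weights, verts)) for a in range(3))
            for verts in zip(*key_shapes)]

# Toy data: 30 "video frames" whose 2D features lie along a half circle.
frames = [(math.cos(math.pi * i / 29), math.sin(math.pi * i / 29))
          for i in range(30)]
Y = isomap(frames, n_neighbors=3, n_components=1)
keys = key_frames(Y, k=3)

# Toy stand-ins for scanned 3D key faces: one two-vertex mesh per key frame.
key_shapes = [[(0.0, float(s), 0.0), (1.0, float(s), 0.0)] for s in range(3)]
w = weights_for(Y[keys[0]], [Y[i] for i in keys])
shape = blend(key_shapes, w)
```

In this sketch, a frame that coincides with a key frame reproduces that key shape almost exactly, and frames in between receive a mixture of the key shapes. A production version would more likely rely on a numerical library for the shortest-path and eigenvector steps.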

Keywords :

computer animation; face recognition; feature extraction; pattern clustering; solid modelling; speech processing; speech synthesis; unsupervised learning; video signal processing; 2D key viseme frame extraction; 3D key face models; 3D key shape extraction; 3D scanner; animated face synthesis; interpolation; isomap-based learning method; isomap-based nonlinear dimensionality reduction; k-means clustering; realistic 3D animation effects; skull movement recovery method; speech animation transferring; speech movements; unsupervised learning process; Anatomical structure; Facial animation; Geometry; Interpolation; Learning systems; Shape; Skull; Speech processing; Speech synthesis; Unsupervised learning; Facial animation; machine learning; performance-driven animation; speech synchronization; visual speech synthesis; Algorithms; Computer Graphics; Face; Humans; Image Enhancement; Image Interpretation, Computer-Assisted; Imaging, Three-Dimensional; Information Storage and Retrieval; Movement; Speech; Video Recording;

fLanguage :

English

Journal_Title :

Visualization and Computer Graphics, IEEE Transactions on

Publisher :

IEEE

ISSN :

1077-2626

Type :

jour

DOI :

10.1109/TVCG.2007.22

Filename :

4015398

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=831357