DocumentCode :
2039470
Title :
Mining audio/visual database for speech driven face animation
Author :
Chen, Yiqiang ; Gao, Wen ; Wang, Zhaoqi ; Miao, Jun ; Jiang, Dalong
Author_Institution :
Inst. of Comput. Technol., Acad. Sinica, Beijing, China
Volume :
4
fYear :
2001
fDate :
2001
Firstpage :
2638
Abstract :
The authors present a data mining framework for audio-visual interaction and apply it to a speech-driven lip-motion facial animation system. First, an unsupervised clustering algorithm is proposed to build a set of clusters, each of which groups similar configurations. Then, a statistical visual model is constructed by specifying all the possible cluster trajectories. The audio is analyzed with respect to the learned clusters of facial gestures. For every cluster, two neural networks are trained to map audio features to the cluster label and to velocity, respectively. Given a new vocal track, the statistical visual model and the neural networks are combined to analyze the control audio, yielding the most likely facial state sequence. The proposed method not only automatically incorporates vocal and facial dynamics such as co-articulation, but is also easy to train and more robust, extensible and interpretable. Two approaches for evaluation testing are also proposed. The system's performance shows that the proposed learning algorithm is suitable and greatly improves the realism of face animation during speech.
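The step of combining a model of cluster trajectories with per-frame audio evidence to obtain a most likely state sequence can be illustrated with a small sketch. The abstract does not say which decoding procedure the authors use; the Viterbi-style dynamic program below is an assumption chosen for illustration, and the function name `viterbi`, the two-cluster setup, and all probability values are hypothetical. In the paper's setting the per-frame scores would come from the trained neural networks; here they are hard-coded.

```python
import math

def viterbi(log_prior, log_trans, log_emit):
    """Return the highest-scoring cluster sequence (illustrative only).

    log_prior[k]    : log probability of starting in cluster k
    log_trans[j][k] : log probability of moving from cluster j to cluster k
    log_emit[t][k]  : log score of cluster k at frame t (assumed classifier output)
    """
    n_states = len(log_prior)
    score = [log_prior[k] + log_emit[0][k] for k in range(n_states)]
    back = []  # back[t-1][k] = best predecessor of cluster k at frame t
    for t in range(1, len(log_emit)):
        prev = score
        pointers, score = [], []
        for k in range(n_states):
            best_j = max(range(n_states), key=lambda j: prev[j] + log_trans[j][k])
            pointers.append(best_j)
            score.append(prev[best_j] + log_trans[best_j][k] + log_emit[t][k])
        back.append(pointers)
    # Trace the best path backwards from the best final cluster.
    state = max(range(n_states), key=lambda k: score[k])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path

def logs(rows):
    """Convert nested lists of probabilities to log probabilities."""
    return [[math.log(p) for p in row] for row in rows]

# Hypothetical values: two clusters that tend to persist from frame to frame.
prior = [math.log(0.6), math.log(0.4)]
trans = logs([[0.9, 0.1], [0.1, 0.9]])
# Frame 1 weakly favors cluster 1 when each frame is considered alone.
emit = logs([[0.9, 0.1], [0.4, 0.6], [0.9, 0.1]])

path = viterbi(prior, trans, emit)
print(path)  # [0, 0, 0]
```

Taken frame by frame, the scores would pick cluster 1 for the middle frame; the trajectory model overrides that weak preference because switching clusters twice is unlikely under the assumed transitions.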
Keywords :
audio signal processing; computer animation; data mining; learning (artificial intelligence); multimedia databases; neural nets; articulation; audio features; audio-visual interaction; audio/visual database mining; cluster trajectories; control audio; data mining framework; evaluation test; face animation; facial gesture; facial state sequence; learning algorithm; lip-syncing; neural networks; speech driven face animation; speech driven lip motion facial animation system; statistical visual model; unsupervised cluster algorithm; vocal track; Audio databases; Automatic control; Clustering algorithms; Data mining; Facial animation; Neural networks; Robustness; Speech; Testing; Visual databases;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Systems, Man, and Cybernetics, 2001 IEEE International Conference on
Conference_Location :
Tucson, AZ
ISSN :
1062-922X
Print_ISBN :
0-7803-7087-2
Type :
conf
DOI :
10.1109/ICSMC.2001.972962
Filename :
972962