Title :
Real-time speech-driven lip synchronization
Author :
Mu, Kaihui ; Tao, Jianhua ; Che, Jianfeng ; Yang, Minghao
Author_Institution :
National Laboratory of Pattern Recognition (NLPR), Chinese Academy of Sciences, Beijing, China
Abstract :
Speech-driven lip synchronization, an important part of facial animation, animates a face model to render lip movements that are synchronized with the acoustic speech signal. It has many applications in human-computer interaction. In this paper, we present a framework that systematically addresses multimodal database collection and processing, and real-time speech-driven lip synchronization based on collaborative filtering, a data-driven approach widely used by online retailers to recommend products. Mel-frequency cepstral coefficients (MFCCs) with their delta and acceleration coefficients serve as the acoustic features, and the Facial Animation Parameters (FAPs) defined by MPEG-4 for the visual representation of speech serve as the animation parameters. The proposed system is speaker independent and runs in real time. Subjective experiments show that the proposed approach generates natural facial animation.
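Illustrative_Sketch :
The abstract outlines, but does not detail, the collaborative-filtering mapping from acoustic features to FAPs. The Python sketch below is an assumption-laden illustration, not the authors' implementation: it presumes a precomputed database of paired 39-dimensional acoustic vectors (13 MFCCs plus their delta and acceleration coefficients) and MPEG-4 FAP vectors, and it fills in choices the record never states (cosine similarity, k = 10 neighbours, and the hypothetical extract_features/predict_faps split). librosa is used only for the standard MFCC and delta computations.

import numpy as np
import librosa

def extract_features(y, sr):
    """39-dim acoustic features per frame: 13 MFCCs + delta + acceleration."""
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13)
    feats = np.vstack([mfcc,
                       librosa.feature.delta(mfcc),            # delta coefficients
                       librosa.feature.delta(mfcc, order=2)])  # acceleration coefficients
    return feats.T  # shape: (n_frames, 39)

def predict_faps(frame_feat, db_feats, db_faps, k=10):
    """Collaborative-filtering-style lookup: score every database frame by
    cosine similarity to the input frame, then return a similarity-weighted
    average of the FAP vectors attached to the k most similar frames."""
    sims = db_feats @ frame_feat / (
        np.linalg.norm(db_feats, axis=1) * np.linalg.norm(frame_feat) + 1e-9)
    top = np.argsort(sims)[-k:]      # indices of the k nearest neighbours
    w = np.maximum(sims[top], 0.0)   # ignore negatively correlated frames
    w /= w.sum() + 1e-9              # normalise the neighbour weights
    return w @ db_faps[top]          # predicted FAP vector, shape (n_faps,)

At run time, each incoming audio frame would be featurized and passed through predict_faps against the multimodal database; because the lookup uses only acoustic similarity, the scheme is speaker independent in the spirit the abstract claims, though temporal smoothing across frames (also unspecified in the record) would normally be applied before driving the face model.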
Keywords :
cepstral analysis; computer animation; human computer interaction; rendering (computer graphics); speech processing; MPEG-4; Mel-frequency cepstral coefficients; acoustic speech signal; collaborative filtering; data-driven approach; facial animation parameters; human-computer interaction; lip movement rendering; multimodal database collection; real-time speech-driven lip synchronization; speech visual representation; Acoustics; Face; Hidden Markov models; Speech; Synchronization; Transform coding; Visualization; FAP; MFCC; collaborative filtering; real-time speech-driven lip synchronization
Conference_Title :
2010 4th International Universal Communication Symposium (IUCS)
Conference_Location :
Beijing
Print_ISBN :
978-1-4244-7821-7
DOI :
10.1109/IUCS.2010.5666250