Title :
Acoustic speech to lip feature mapping for multimedia applications
Author :
Li, Chengliang ; Dansereau, Richard M. ; Goubran, Rafik A.
Author_Institution :
Dept. of Syst. & Comput. Eng., Carleton Univ., Ottawa, Ont., Canada
Abstract :
This paper presents a quantitative analysis of the relationship between acoustic speech and the corresponding lip features. The lip features, such as the lip width, inner lip height and outer lip height, are acquired by an extraction algorithm that combines color and edge information within a Markov random field (MRF) framework. The acoustic speech is parameterized with LSP (linear spectrum pairs) coefficients. The LSP coefficients and the lip features are then used to train mapping models, which in turn estimate lip features from acoustic speech. The estimated lip features match the measured lip features fairly well, with correlation coefficients between the two as high as 0.90. Estimating lip features from acoustic speech provides a way to integrate acoustic and visual speech, which is useful for speech-driven face animation, audio-video synchronization and foreign film dubbing.
Keywords :
Markov processes; feature extraction; gesture recognition; multimedia systems; neural nets; speech processing; synchronisation; video signal processing; LSP; MRF; Markov random field; acoustic speech; audio-video synchronization; extraction algorithm; foreign film dubbing; linear spectrum pairs; lip feature mapping; multimedia applications; quantitative analysis; speech driven face animation; Acoustic applications; Acoustic measurements; Acoustical engineering; Application software; Data mining; Facial animation; Feature extraction; Lips; Loudspeakers; Speech analysis;
Conference_Title :
Proceedings of the 3rd International Symposium on Image and Signal Processing and Analysis, 2003 (ISPA 2003)
Print_ISBN :
953-184-061-X
DOI :
10.1109/ISPA.2003.1296393