DocumentCode
2303968
Title
Speech Driven 3D Head Gesture Synthesis
Author
Sargin, M.E. ; Erzin, E. ; Yemez, Y. ; Tekalp, A.M. ; Erdem, Arif Tanju
Author_Institution
Cokluortam, Goru ve Grafik Lab., Koc Univ., Istanbul
fYear
2006
fDate
17-19 April 2006
Firstpage
1
Lastpage
4
Abstract
In this paper, we present a speech-driven natural head gesture analysis and synthesis system. The proposed system assumes that sharp head movements are correlated with prominence in speech. For analysis, a binocular camera system is employed to capture the head motion of a talking person. The motion parameters associated with the 3D head motion are then used to extract the repetitive head gestures. In parallel, prosodic events are detected using an HMM structure with pitch, formant frequencies, and speech intensity as audio features. For synthesis, the head motion parameters are estimated from the prosodic events based on a gesture-speech correlation model, and the associated Euler angles are then used for speech-driven animation of a 3D personalized talking head model. Results on head motion feature extraction, prosodic event detection, and correlation modelling are provided.
Keywords
audio signal processing; feature extraction; gesture recognition; hidden Markov models; motion estimation; speech recognition; speech synthesis; video cameras; 3D head gesture synthesis; HMM structure; audio feature extraction; binocular camera system; gesture-speech correlation model; head motion parameter estimation; hidden Markov model; prosodic event detection; speech driven animation; speech intensity; Animation; Cameras; Event detection; Frequency; Hidden Markov models; Motion analysis; Motion estimation; Parameter estimation; Speech analysis; Speech synthesis
fLanguage
English
Publisher
IEEE
Conference_Titel
Signal Processing and Communications Applications, 2006 IEEE 14th
Conference_Location
Antalya
Print_ISBN
1-4244-0238-7
Type
conf
DOI
10.1109/SIU.2006.1659683
Filename
1659683