Title :
HMM-based transmodal mapping from audio speech to talking faces
Author :
Nakamura, Satoshi
Author_Institution :
ATR Spoken Language Translation Res. Labs., Kyoto, Japan
Abstract :
This paper describes a transmodal mapping from audio speech to talking faces based on hidden Markov models (HMMs). If face movements can be synthesized well enough for natural communication, human-machine communication will benefit greatly. The paper presents an HMM-based speech-driven lip movement synthesis method, its improvement through audio-visual joint estimation, and its extension to talking face generation. Evaluation experiments show that the proposed method generates natural and accurate talking faces from audio speech input.
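The core idea summarized above can be illustrated with a toy sketch: hidden states model audio feature distributions, each state is associated with a visual (lip-opening) parameter, and Viterbi decoding of the audio frames yields a state path that is mapped to a lip trajectory. All names, dimensions, and parameter values below are illustrative assumptions, not the paper's actual model.

```python
import numpy as np

# Toy HMM for audio-to-visual mapping (all values are assumptions):
# two states, 1-D Gaussian audio emissions, one lip parameter per state.
means = np.array([0.0, 5.0])          # per-state audio feature means (assumed)
var = 1.0                              # shared emission variance (assumed)
lip_open = np.array([0.1, 0.9])        # lip-opening value per state (assumed)
log_A = np.log(np.array([[0.9, 0.1],
                         [0.1, 0.9]]))  # state transition log-probabilities
log_pi = np.log(np.array([0.5, 0.5]))   # initial state log-distribution

def viterbi_lip_trajectory(audio):
    """Decode the most likely state path for 1-D audio features and
    return the corresponding lip-opening trajectory."""
    T, S = len(audio), len(means)
    # log Gaussian emission likelihood of each frame under each state
    # (constant terms dropped; they do not affect the argmax)
    log_b = -0.5 * ((audio[:, None] - means[None, :]) ** 2) / var
    delta = np.zeros((T, S))             # best path log-score ending in state
    psi = np.zeros((T, S), dtype=int)    # backpointers
    delta[0] = log_pi + log_b[0]
    for t in range(1, T):
        scores = delta[t - 1][:, None] + log_A   # scores[i, j]: i -> j
        psi[t] = np.argmax(scores, axis=0)
        delta[t] = scores[psi[t], np.arange(S)] + log_b[t]
    # backtrack the best state path
    path = np.zeros(T, dtype=int)
    path[-1] = np.argmax(delta[-1])
    for t in range(T - 2, -1, -1):
        path[t] = psi[t + 1, path[t + 1]]
    # transmodal step: map decoded audio states to visual parameters
    return lip_open[path]

traj = viterbi_lip_trajectory(np.array([0.1, -0.2, 4.8, 5.1, 0.0]))
```

The paper's method additionally refines this mapping by jointly estimating audio and visual streams rather than decoding from audio alone; this sketch only shows the basic audio-driven decoding step.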
Keywords :
audio-visual systems; computer animation; hidden Markov models; speech processing; speech-based user interfaces; HMM-based speech-driven lip movement synthesis; audio speech inputs; audio-visual joint estimation; face movement synthesis; human-machine communication; natural communication; talking face generation; transmodal mapping; Electronic mail; Face; Hidden Markov models; Image converters; Laboratories; Natural languages; Signal mapping; Signal synthesis; Speech processing; Speech synthesis
Conference_Title :
Neural Networks for Signal Processing X, 2000. Proceedings of the 2000 IEEE Signal Processing Society Workshop
Conference_Location :
Sydney, NSW
Print_ISBN :
0-7803-6278-0
DOI :
10.1109/NNSP.2000.889360