Cross-speaker viseme mapping using hidden Markov models

Author

Dong, Liang ; Foo, Say Wei ; Lian, Yong

Author_Institution

Dept. of Electr. & Comput. Eng., Nat. Univ. of Singapore, Singapore

Volume

3

fYear

2003

fDate

15-18 Dec. 2003

Firstpage

1384

Abstract

This paper proposes a method for mapping visual speech between different speakers. The approach uses hidden Markov models (HMMs) to model the viseme, the basic element of visual speech. Mapping terms are introduced to associate the state chains decoded for visemes produced by different speakers. The HMMs configured in this way are trained using Baum-Welch estimation and are then used to generate new visemes. Experiments are conducted in which visemes produced by several speakers are mapped to a destination speaker. The experimental results show that the proposed approach provides good accuracy and continuity when mapping the visemes.
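
The abstract refers to state chains decoded from viseme HMMs and to mapping terms that link those chains across speakers. The sketch below is only an illustration of that general idea, not the authors' implementation: it decodes a state chain with the standard Viterbi algorithm for a small discrete HMM, then relabels the chain through a lookup table. The model parameters, the observation symbols, and the `state_map` table are all hypothetical placeholders, since the abstract does not specify how the mapping terms are defined or how the features are represented.

```python
import math


def _log(p):
    # Log of a probability, with log(0) treated as negative infinity.
    return math.log(p) if p > 0 else float("-inf")


def viterbi(obs, start_p, trans_p, emit_p):
    """Return the most likely state chain for a discrete observation sequence."""
    n = len(start_p)
    # Scores are kept in log space to avoid underflow on long sequences.
    score = [_log(start_p[s]) + _log(emit_p[s][obs[0]]) for s in range(n)]
    back = []  # back[t][s] = best predecessor of state s at step t + 1
    for o in obs[1:]:
        prev, score, ptr = score, [], []
        for s in range(n):
            best = max(range(n), key=lambda r: prev[r] + _log(trans_p[r][s]))
            ptr.append(best)
            score.append(prev[best] + _log(trans_p[best][s]) + _log(emit_p[s][o]))
        back.append(ptr)
    # Trace the best path backwards from the highest-scoring final state.
    state = max(range(n), key=lambda s: score[s])
    chain = [state]
    for ptr in reversed(back):
        state = ptr[state]
        chain.append(state)
    chain.reverse()
    return chain


# Hypothetical 3-state left-to-right viseme model for a source speaker.
start_p = [1.0, 0.0, 0.0]
trans_p = [[0.6, 0.4, 0.0],
           [0.0, 0.6, 0.4],
           [0.0, 0.0, 1.0]]
emit_p = [[0.8, 0.1, 0.1],
          [0.1, 0.8, 0.1],
          [0.1, 0.1, 0.8]]
obs = [0, 0, 1, 1, 2, 2]  # quantized visual features (assumed encoding)

source_chain = viterbi(obs, start_p, trans_p, emit_p)

# Hypothetical mapping from source-speaker states to destination-speaker
# states; a stand-in for the paper's mapping terms, which are not given here.
state_map = {0: "dst_onset", 1: "dst_peak", 2: "dst_release"}
dest_chain = [state_map[s] for s in source_chain]

print(source_chain)
print(dest_chain)
```

In a fuller pipeline the HMM parameters would come from Baum-Welch training on recorded visemes rather than being written by hand, and the destination states would index that speaker's own output distributions.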

Keywords

face recognition; gesture recognition; hidden Markov models; speaker recognition; video signal processing; Baum-Welch estimation; basic visual speech element; cross-speaker viseme mapping; hidden Markov model; state chains; visual speech mapping; Animation; Automatic speech recognition; Decoding; Games; Hidden Markov models; Loudspeakers; Performance analysis; Robustness; Speech processing; Wideband;

fLanguage

English

Publisher

IEEE

Conference_Titel

Proceedings of the 2003 Joint Conference of the Fourth International Conference on Information, Communications and Signal Processing, 2003 and Fourth Pacific Rim Conference on Multimedia

Print_ISBN

0-7803-8185-8

Type

conf

DOI

10.1109/ICICS.2003.1292692

Filename

1292692