Regional Information Center for Science and Technology (RICeST) - Synthesizing visual speech trajectory with minimum generation error

DocumentCode :

2174513

Title :

Synthesizing visual speech trajectory with minimum generation error

Author :

Wang, Lijuan ; Wu, Yi-Jian ; Zhuang, Xiaodan ; Soong, Frank K.

Author_Institution :

Microsoft Res. Asia, Microsoft Corp., Beijing, China

fYear :

2011

fDate :

22-27 May 2011

Firstpage :

4580

Lastpage :

4583

Abstract :

In this paper, we propose a minimum generation error (MGE) training method to refine the audio-visual HMM to improve visual speech trajectory synthesis. Compared with the traditional maximum likelihood (ML) estimation, the proposed MGE training explicitly optimizes the quality of generated visual speech trajectory, where the audio-visual HMM modeling is jointly refined by using a heuristic method to find the optimal state alignment and a probabilistic descent algorithm to optimize the model parameters under the MGE criterion. In objective evaluation, compared with the ML-based method, the proposed MGE-based method achieves consistent improvement in the mean square error reduction, correlation increase, and recovery of global variance. It also improves the naturalness and audio-visual consistency perceptually in the subjective test.

Keywords :

hidden Markov models; mean square error methods; speech synthesis; audiovisual HMM; heuristic method; mean square error reduction; minimum generation error training method; optimal state alignment; probabilistic descent algorithm; traditional maximum likelihood estimation; visual speech trajectory synthesis; Acoustics; Hidden Markov models; Speech; Speech synthesis; Training; Trajectory; Visualization; minimum generation error; photo-real; talking head; trajectory-guided; visual speech synthesis;

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Acoustics, Speech and Signal Processing (ICASSP), 2011 IEEE International Conference on

Conference_Location :

Prague

ISSN :

1520-6149

Print_ISBN :

978-1-4577-0538-0

Electronic_ISBN :

1520-6149

Type :

conf

DOI :

10.1109/ICASSP.2011.5947374

Filename :

5947374

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=2174513