• DocumentCode
    2174513
  • Title
    Synthesizing visual speech trajectory with minimum generation error
  • Author
    Wang, Lijuan ; Wu, Yi-Jian ; Zhuang, Xiaodan ; Soong, Frank K.
  • Author_Institution
    Microsoft Res. Asia, Microsoft Corp., Beijing, China
  • fYear
    2011
  • fDate
    22-27 May 2011
  • Firstpage
    4580
  • Lastpage
    4583
  • Abstract
    In this paper, we propose a minimum generation error (MGE) training method that refines the audio-visual HMM to improve visual speech trajectory synthesis. Unlike traditional maximum likelihood (ML) estimation, the proposed MGE training explicitly optimizes the quality of the generated visual speech trajectory: the audio-visual HMM is jointly refined by using a heuristic method to find the optimal state alignment and a probabilistic descent algorithm to optimize the model parameters under the MGE criterion. In objective evaluation, the proposed MGE-based method consistently outperforms the ML-based method, reducing mean square error, increasing correlation, and better recovering global variance. In the subjective test, it also improves perceived naturalness and audio-visual consistency.
  • Keywords
    hidden Markov models; mean square error methods; speech synthesis; audiovisual HMM; heuristic method; mean square error reduction; minimum generation error training method; optimal state alignment; probabilistic descent algorithm; traditional maximum likelihood estimation; visual speech trajectory synthesis; Acoustics; Hidden Markov models; Speech; Speech synthesis; Training; Trajectory; Visualization; minimum generation error; photo-real; talking head; trajectory-guided; visual speech synthesis;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Acoustics, Speech and Signal Processing (ICASSP), 2011 IEEE International Conference on
  • Conference_Location
    Prague
  • ISSN
    1520-6149
  • Print_ISBN
    978-1-4577-0538-0
  • Electronic_ISBN
    1520-6149
  • Type
    conf
  • DOI
    10.1109/ICASSP.2011.5947374
  • Filename
    5947374