• DocumentCode
    1453829
  • Title
    A Statistical Quality Model for Data-Driven Speech Animation
  • Author
    Ma, Xiaohan ; Deng, Zhigang
  • Author_Institution
    Dept. of Comput. Sci., Univ. of Houston, Houston, TX, USA
  • Volume
    18
  • Issue
    11
  • fYear
    2012
  • Firstpage
    1915
  • Lastpage
    1927
  • Abstract
    In recent years, data-driven speech animation approaches have achieved significant success in terms of animation quality. However, automatically evaluating the realism of newly synthesized speech animations has remained an important yet unsolved research problem. In this paper, we propose a novel statistical model (called SAQP) to automatically predict the quality of speech animations synthesized on the fly by various data-driven techniques. Its essential idea is to construct a phoneme-based Speech Animation Trajectory Fitting (SATF) metric that describes speech animation synthesis errors, and then to build a statistical regression model that learns the association between the obtained SATF metric and the objective speech animation synthesis quality. Through carefully designed user studies, we evaluate the effectiveness and robustness of the proposed SAQP model. To the best of our knowledge, this work is the first quantitative quality model of its kind for data-driven speech animation. We believe it is an important first step toward removing a critical technical barrier to applying data-driven speech animation techniques in numerous online or interactive talking avatar applications.
  • Keywords
    computer animation; regression analysis; speech processing; speech synthesis; SAQP; SATF; animation quality; data-driven speech animation approach; data-driven techniques; interactive talking avatar applications; novel statistical model; on-the-fly synthesized speech animations; speech animation trajectory fitting metric; statistical quality model; statistical regression model; Animation; Face; Measurement; Predictive models; Principal component analysis; Speech; Trajectory; Facial animation; data-driven; lip-sync; quality prediction; statistical models; visual speech animation;
  • fLanguage
    English
  • Journal_Title
    Visualization and Computer Graphics, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    1077-2626
  • Type
    jour
  • DOI
    10.1109/TVCG.2012.67
  • Filename
    6155718