• DocumentCode
    1464121
  • Title
    Live Speech Driven Head-and-Eye Motion Generators
  • Author
    Le, Binh H. ; Ma, Xiaohan ; Deng, Zhigang
  • Author_Institution
    Dept. of Comput. Sci., Univ. of Houston, Houston, TX, USA
  • Volume
    18
  • Issue
    11
  • fYear
    2012
  • Firstpage
    1902
  • Lastpage
    1914
  • Abstract
    This paper describes a fully automated framework to generate realistic head motion, eye gaze, and eyelid motion simultaneously based on live (or recorded) speech input. Its central idea is to learn separate yet interrelated statistical models for each component (head motion, gaze, or eyelid motion) from a prerecorded facial motion data set: 1) Gaussian Mixture Models and a gradient descent optimization algorithm are employed to generate head motion from speech features; 2) a Nonlinear Dynamic Canonical Correlation Analysis model is used to synthesize eye gaze from head motion and speech features; and 3) nonnegative linear regression is used to model voluntary eyelid motion, and a log-normal distribution is used to describe involuntary eye blinks. Several user studies are conducted to evaluate the effectiveness of the proposed speech-driven head and eye motion generator using the well-established paired comparison methodology. Our evaluation results clearly show that this approach can significantly outperform the state-of-the-art head and eye motion generation algorithms. In addition, a novel mocap+video hybrid data acquisition technique is introduced to record high-fidelity head movement, eye gaze, and eyelid motion simultaneously.
  • Keywords
    Gaussian processes; computer animation; data acquisition; eye; face recognition; gradient methods; image motion analysis; log normal distribution; optimisation; realistic images; statistical analysis; video signal processing; Gaussian mixture models; eye gaze generation; eye gaze recording; eye gaze synthesis; eyelid motion generation; eyelid motion recording; facial animation; facial motion data set; fully automated framework; gradient descent optimization algorithm; high-fidelity head movement recording; live speech driven head-and-eye motion generators; live speech input; log-normal distribution; mocap+video hybrid data acquisition technique; nonlinear dynamic canonical correlation analysis model; nonnegative linear regression; realistic head motion generation; speech features; statistical models; voluntary eye lid motion model; Data acquisition; Hidden Markov models; Humans; Magnetic heads; Speech; Synchronization; Facial animation; live speech driven; blinking model; gaze synthesis; head and eye motion coupling; head motion synthesis
  • fLanguage
    English
  • Journal_Title
    Visualization and Computer Graphics, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    1077-2626
  • Type
    jour
  • DOI
    10.1109/TVCG.2012.74
  • Filename
    6165277