• DocumentCode
    3269784
  • Title
    Accurate visual speech synthesis based on diviseme unit selection and concatenation
  • Author
    Jiang, Dongmei ; Ravyse, Ilse ; Sahli, Hichem ; Zhang, Yanning
  • Author_Institution
    Joint Res. Group on Audio Visual Signal Process., Northwestern Polytech. Univ., Xi'an
  • fYear
    2008
  • fDate
    8-10 Oct. 2008
  • Firstpage
    906
  • Lastpage
    909
  • Abstract
    This paper presents a novel speech-driven approach to accurate, realistic visual speech synthesis. First, an audio-visual instance database is built for different viseme context combinations, i.e. diviseme units, using 100 audio-visual speech sentences from a female speaker. Then a diviseme instance selection algorithm is introduced to choose the optimal diviseme instances for the viseme contexts in the input speech, considering both the concatenation smoothness of the image sequences and how well the mouth movements match the acoustic pronunciation process and the intensity of the input speech. Finally, the mouth image sequences of the corresponding viseme segments in the selected diviseme instances are time-warped and blended to construct the mouth images of the final animation. Visual speech synthesis experiments and subjective evaluation results show that the resulting mouth animations are not only realistic, with clear and smooth mouth images, but also in good accordance with the acoustic pronunciation and intensity of the input speech.
  • Keywords
    image matching; image motion analysis; image segmentation; image sequences; speech synthesis; visual communication; acoustic pronunciation process; audio visual instance database; diviseme instance selection algorithm; image sequences; mouth animations; mouth movements matching; visual speech synthesis approach; Animation; Audio databases; Image databases; Image segmentation; Image sequences; Loudspeakers; Mouth; Speech processing; Speech synthesis; Visual databases;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Multimedia Signal Processing, 2008 IEEE 10th Workshop on
  • Conference_Location
    Cairns, Qld
  • Print_ISBN
    978-1-4244-2294-4
  • Electronic_ISBN
    978-1-4244-2295-1
  • Type
    conf
  • DOI
    10.1109/MMSP.2008.4665203
  • Filename
    4665203