Author/Authors :
Sumedha Kshirsagar, Nadia Magnenat-Thalmann
Abstract :
Visemes are the visual counterparts of phonemes. Traditionally, the speech animation of 3D synthetic faces involves the
extraction of visemes from input speech, followed by the application of co-articulation rules to generate realistic
animation. In this paper, we take a novel approach to speech animation: using visyllables, the visual counterparts
of syllables. The approach results in a concatenative, visyllable-based speech animation system. The key contributions
of this paper lie in two main areas. Firstly, we define a set of visyllable units for spoken English, along with
the associated phonological rules for valid syllables. Based on these rules, we have implemented a syllabification
algorithm that segments a given phoneme stream into syllables and subsequently into visyllables. Secondly,
we have recorded a database of visyllables using a facial motion capture system. The recorded visyllable
units are post-processed semi-automatically to ensure continuity at the vowel boundaries of the visyllables. We define
each visyllable in terms of the Facial Movement Parameters (FMPs). The FMPs are obtained from a
statistical analysis of the facial motion capture data. The FMPs allow a compact representation of the visyllables.
Further, the FMPs also facilitate the formulation of rules for boundary matching and smoothing after concatenating
the visyllable units. Ours is the first visyllable-based speech animation system. The proposed technique is
easy to implement, is effective for real-time as well as non-real-time applications, and results in realistic speech
animation.
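
The syllabification step described in the abstract can be illustrated with a minimal sketch. The following Python fragment applies the maximal-onset principle to a flat phoneme stream; the VOWELS and VALID_ONSETS sets are small illustrative placeholders, not the phonological rule set defined in the paper.

    # Minimal syllabification sketch (maximal-onset principle).
    VOWELS = {"AA", "AE", "AH", "AO", "AW", "AY", "EH", "ER",
              "EY", "IH", "IY", "OW", "OY", "UH", "UW"}

    # Hypothetical subset of legal English onsets; the paper's phonological
    # rules would enumerate the full inventory.
    VALID_ONSETS = {(), ("P",), ("T",), ("K",), ("B",), ("D",), ("G",),
                    ("M",), ("N",), ("L",), ("R",), ("S",), ("F",),
                    ("S", "T"), ("S", "P"), ("P", "R"), ("T", "R"),
                    ("S", "T", "R")}

    def syllabify(phonemes):
        """Segment a phoneme stream into syllables."""
        vowel_idx = [i for i, p in enumerate(phonemes) if p in VOWELS]
        if not vowel_idx:
            return [list(phonemes)]          # no vowel: keep as one chunk

        syllables = []
        start = 0                            # start of the current syllable
        for prev_v, next_v in zip(vowel_idx, vowel_idx[1:]):
            cluster = phonemes[prev_v + 1:next_v]
            # Give the next syllable the longest valid onset in the cluster;
            # the remaining consonants stay in the coda of this syllable.
            split = len(cluster)
            for k in range(len(cluster) + 1):
                if tuple(cluster[len(cluster) - k:]) in VALID_ONSETS:
                    split = len(cluster) - k
            syllables.append(phonemes[start:prev_v + 1 + split])
            start = prev_v + 1 + split
        syllables.append(phonemes[start:])   # last syllable takes the tail
        return syllables

    # e.g. syllabify(["K", "AH", "N", "S", "T", "R", "AH", "K", "T"])
    # -> [['K', 'AH', 'N'], ['S', 'T', 'R', 'AH', 'K', 'T']]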
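Similarly, the boundary smoothing applied after concatenating visyllable units can be sketched as a cross-fade of FMP trajectories at the shared vowel boundary. The linear ramp below is an assumed stand-in for the paper's boundary-matching and smoothing rules, and the array shapes are hypothetical.

    import numpy as np

    def blend_boundary(left_fmp, right_fmp, overlap):
        """Cross-fade two FMP trajectories over `overlap` frames.

        left_fmp, right_fmp: arrays of shape (frames, n_params).
        A linear ramp stands in for the paper's smoothing rules."""
        w = np.linspace(0.0, 1.0, overlap)[:, None]
        mixed = (1.0 - w) * left_fmp[-overlap:] + w * right_fmp[:overlap]
        return np.concatenate([left_fmp[:-overlap], mixed, right_fmp[overlap:]])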