DocumentCode :
312176
Title :
Articulatory synthesis from X-rays and inversion for an adaptive speech robot
Author :
Badin, Pierre ; Abry, Christian
Author_Institution :
Inst. de la Commun. Parlee, CNRS, Grenoble, France
Volume :
2
fYear :
1996
fDate :
3-6 Oct 1996
Firstpage :
1125
Abstract :
Describes a speech robotic approach to articulatory synthesis. An anthropomorphic speech robot has been built, based on a real reference subject's data. This robot, called the Articulotron, has a set of relevant degrees of freedom for the speech articulators: jaw, tongue, lips and larynx. The associated articulatory model has been elaborated from cineradiographic midsagittal profiles recorded in synchrony with front lip views; the model of the noise source for fricative excitation has been derived from acoustic and aerodynamic measurements on the same reference subject. In a first phase, the Articulotron has been used to perform the copy synthesis of the vowels, fricative and plosive consonants in the X-ray corpus. This allows one to assess the performance of the Articulotron in producing fairly high-quality speech, and provides a reference against which other attempts at articulatory synthesis can be compared. In a second phase, the Articulotron has been used to recover articulatory gestures from audio-visual speech prototypes. A gradient descent algorithm is used to learn the articulatory trajectories of the robot by optimisation, starting from the formant trajectories and the knowledge of constraints for the consonantal constriction or closure, in order to mimic the original VCV (vowel-consonant-vowel) audio-visual sequences. The adaptive skill of the robot is demonstrated through articulator perturbation experiments and through the elaboration of relevant strategies in the hyper/hypo-speech paradigm.
Keywords :
X-ray applications; adaptive systems; optimisation; robots; speech synthesis; Articulotron; X-ray corpus; acoustic measurements; adaptive anthropomorphic speech robot; aerodynamic measurements; articulator perturbation; articulatory gesture recovery; articulatory synthesis; articulatory trajectory learning; audio-visual speech prototypes; cineradiographic midsagittal profiles; consonantal closure; consonantal constriction; copy synthesis; formant trajectories; fricative consonants; fricative excitation; gradient descent algorithm; high-quality speech; hyper-speech paradigm; hypo-speech paradigm; inversion; jaw; larynx; lips; noise source; optimization; plosive consonants; speech articulators; tongue; vowels; Acoustic measurements; Acoustic noise; Aerodynamics; Anthropomorphism; Larynx; Lips; Robots; Speech synthesis; Tongue; X-rays;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Spoken Language, 1996. ICSLP 96. Proceedings., Fourth International Conference on
Conference_Location :
Philadelphia, PA
Print_ISBN :
0-7803-3555-4
Type :
conf
DOI :
10.1109/ICSLP.1996.607804
Filename :
607804