• DocumentCode
    312176
  • Title
    Articulatory synthesis from X-rays and inversion for an adaptive speech robot
  • Author
    Badin, Pierre; Abry, Christian
  • Author_Institution
    Inst. de la Commun. Parlee, CNRS, Grenoble, France
  • Volume
    2
  • fYear
    1996
  • fDate
    3-6 Oct 1996
  • Firstpage
    1125
  • Abstract
    Describes a speech robotic approach to articulatory synthesis. An anthropomorphic speech robot has been built, based on a real reference subject's data. This robot, called the Articulotron, has a set of relevant degrees of freedom for the speech articulators: jaw, tongue, lips and larynx. The associated articulatory model has been elaborated from cineradiographic midsagittal profiles recorded in synchrony with front lip views; the model of the noise source for fricative excitation has been derived from acoustic and aerodynamic measurements on the same reference subject. In a first phase, the Articulotron has been used to perform the copy synthesis of the vowels and of the fricative and plosive consonants in the X-ray corpus. This allows one to assess the performance of the Articulotron in producing fairly high-quality speech, and provides a reference against which other attempts at articulatory synthesis can be compared. In a second phase, the Articulotron has been used to recover articulatory gestures from audio-visual speech prototypes. A gradient descent algorithm is used to learn the articulatory trajectories of the robot by optimisation, starting from the formant trajectories and the knowledge of constraints for the consonantal constriction or closure, in order to mimic the original VCV (vowel-consonant-vowel) audio-visual sequences. The adaptive skill of the robot is demonstrated through articulator perturbation experiments and through the elaboration of relevant strategies in the hyper/hypo-speech paradigm.
  • Keywords
    X-ray applications; adaptive systems; optimisation; robots; speech synthesis; Articulotron; X-ray corpus; acoustic measurements; adaptive anthropomorphic speech robot; aerodynamic measurements; articulator perturbation; articulatory gesture recovery; articulatory synthesis; articulatory trajectory learning; audio-visual speech prototypes; cineradiographic midsagittal profiles; consonantal closure; consonantal constriction; copy synthesis; formant trajectories; fricative consonants; fricative excitation; gradient descent algorithm; high-quality speech; hyper-speech paradigm; hypo-speech paradigm; inversion; jaw; larynx; lips; noise source; optimization; plosive consonants; speech articulators; tongue; vowels; Acoustic measurements; Acoustic noise; Aerodynamics; Anthropomorphism; Larynx; Lips; Robots; Speech synthesis; Tongue; X-rays;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Spoken Language, 1996. ICSLP 96. Proceedings., Fourth International Conference on
  • Conference_Location
    Philadelphia, PA
  • Print_ISBN
    0-7803-3555-4
  • Type
    conf
  • DOI
    10.1109/ICSLP.1996.607804
  • Filename
    607804