  • DocumentCode
    3167396
  • Title
    Improvement of animated articulatory gesture extracted from speech for pronunciation training
  • Author
    Iribe, Yurie ; Manosavan, Silasak ; Katsurada, Kouichi ; Hayashi, Ryoko ; Zhu, Chunyue ; Nitta, Tsuneo
  • Author_Institution
    Grad. Sch. of Eng., Toyohashi Univ. of Technol., Toyohashi, Japan
  • fYear
    2012
  • fDate
    25-30 March 2012
  • Firstpage
    5133
  • Lastpage
    5136
  • Abstract
    Computer-assisted pronunciation training (CAPT) was introduced for language education in recent years. CAPT scores the learner's pronunciation quality and points out wrong phonemes by using speech recognition technology. However, although the learner can thus realize that his/her speech is different from the teacher's, the learner still cannot control the articulation organs to pronounce correctly. The learner cannot understand how to correct the wrong articulatory gestures precisely. We indicate these differences by visualizing a learner's wrong pronunciation movements and the correct pronunciation movements with CG animation. We propose a system for generating animated pronunciation by estimating a learner's pronunciation movements from his/her speech automatically. The proposed system maps speech to coordinate values that are needed to generate the animations by using multilayer perceptron neural networks (MLP). We use MRI data to generate smooth animated pronunciations. Additionally, we verify whether the vocal tract area and articulatory features are suitable as characteristics of pronunciation movement through experimental evaluation.
  • Keywords
    biological tissues; biomedical MRI; computer aided instruction; computer animation; gesture recognition; linguistics; medical image processing; multilayer perceptrons; speech recognition; CAPT; CG animation; MLP; MRI data; animated articulatory gesture; articulation organ control; articulatory feature; articulatory gesture extraction; computer-assisted pronunciation training; language education; learner pronunciation quality; learner wrong pronunciation movement visualization; multilayer perceptron neural network; pronunciation animation; speech recognition technology; teacher; vocal tract area; wrong phonemes; Animation; Feature extraction; Magnetic resonance imaging; Speech; Training; Vectors; Articulatory Gesture; Pronunciation Animation; Pronunciation Training; Vocal Tract Area;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Acoustics, Speech and Signal Processing (ICASSP), 2012 IEEE International Conference on
  • Conference_Location
    Kyoto
  • ISSN
    1520-6149
  • Print_ISBN
    978-1-4673-0045-2
  • Electronic_ISBN
    1520-6149
  • Type
    conf
  • DOI
    10.1109/ICASSP.2012.6289076
  • Filename
    6289076