DocumentCode :
153596
Title :
Towards realizing gesture-to-speech conversion with a HMM-based bilingual speech synthesis system
Author :
Hongwu Yang ; Xiaochun An ; Dong Pei ; Yitong Liu
Author_Institution :
Coll. of Phys. & Electron. Eng., Northwest Normal Univ., Lanzhou, China
fYear :
2014
fDate :
20-23 Sept. 2014
Firstpage :
97
Lastpage :
100
Abstract :
This paper presents a gesture-to-speech conversion system to address the communication problem between healthy people and people with speech disorders. An improved speeded-up robust features (SURF) algorithm, combined with a Kinect sensor, is adopted for static gesture recognition. Meanwhile, a hidden Markov model (HMM) based Mandarin-Tibetan bilingual speech synthesis system is developed using speaker adaptive training. A set of semantic rules is designed for the static gestures, and Chinese or Tibetan context-dependent labels for the recognized gestures are generated according to these rules. The recognized gestures are finally converted to Mandarin or Tibetan speech by the bilingual speech synthesis system using the context-dependent labels. Tests show that the system achieves a static gesture recognition rate of 97.1%. Subjective evaluation shows that the synthesized speech achieves a mean opinion score (MOS) of 4.0.
Keywords :
feature extraction; gesture recognition; handicapped aids; hidden Markov models; natural language processing; sensors; speaker recognition; speech synthesis; Chinese context-dependent labels; HMM based Mandarin-Tibetan bilingual speech synthesis system; Kinect sensor; MOS; SURF algorithm; Tibetan context-dependent labels; communication problem; gesture-to-speech conversion system; healthy people; hidden Markov model; mean opinion score; semantic rules; speaker adaptive training; speech disorders; speeded up robust features algorithm; static gesture recognition; synthesized speech; Assistive technology; Gesture recognition; Hidden Markov models; Semantics; Speech; Speech recognition; Speech synthesis; Kinect sensor; Mandarin-Tibetan bilingual speech synthesis; context-dependent label; hidden Markov Model; improved SURF algorithm; speech synthesis; static gesture recognition;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Orange Technologies (ICOT), 2014 IEEE International Conference on
Conference_Location :
Xi'an
Type :
conf
DOI :
10.1109/ICOT.2014.6956608
Filename :
6956608