DocumentCode :
534937
Title :
Geometrical and Pixel Based Lip Feature Fusion in Speech Synthesis System Driven by Visual-speech
Author :
Mengjun, Wang
Author_Institution :
Sch. of Inf. Eng., HeBei Univ. of Technol., Tianjin, China
Volume :
1
fYear :
2010
fDate :
13-14 Sept. 2010
Firstpage :
149
Lastpage :
153
Abstract :
Lipreading is applied to synthesize speech for speech-impaired people. To obtain a higher recognition rate, feature-level data fusion with weighting coefficients is used to integrate the lip information from different kinds of lip features. Experiments are carried out using HMMs with different numbers of states and Gaussian mixture components on a small database for the speaker-dependent case. The most important conclusion drawn from the recognition results is that, with a 4-state, 16-Gaussian-mixture-component HMM, the integrated discriminate vector obtained after feature fusion outperforms the geometrical feature vector alone, the DCT descriptor vector alone, and the DCT coefficient vector alone. Compared with cascading the geometrical feature vector with the DCT descriptors, cascading it with the DCT coefficients integrates more information about the lip region, and the recognition rate improves by as much as 3.18% with the best weighting coefficients (m:n = 1.5:1).
Keywords :
Gaussian processes; discrete cosine transforms; feature extraction; hidden Markov models; image fusion; image sequences; speech synthesis; Gaussian mixture; data fusion; discrete cosine transforms; discriminate vector; geometrical based lip feature fusion; hidden Markov models; lipreading; pixel based lip feature fusion; speech synthesis system; speech-impaired people; visual-speech; weighting coefficients; Acoustics; Adaptation model; Computational modeling; Discrete cosine transforms; Educational institutions; Hidden Markov models;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Computational Intelligence and Natural Computing Proceedings (CINC), 2010 Second International Conference on
Conference_Location :
Wuhan
Print_ISBN :
978-1-4244-7705-0
Type :
conf
DOI :
10.1109/CINC.2010.5643872
Filename :
5643872