Regional Information Center for Science and Technology - Geometrical and Pixel Based Lip Feature Fusion in Speech Synthesis System Driven by Visual-speech

DocumentCode :

534937

Title :

Geometrical and Pixel Based Lip Feature Fusion in Speech Synthesis System Driven by Visual-speech

Author :

Mengjun, Wang

Author_Institution :

Sch. of Inf. Eng., HeBei Univ. of Technol., Tianjin, China

Volume :

fYear :

2010

fDate :

13-14 Sept. 2010

Firstpage :

149

Lastpage :

153

Abstract :

Lipreading is applied to synthesize speech for speech-impaired people. To obtain a higher recognition rate, feature-level data fusion with weighting coefficients is used to integrate the lip information from different kinds of lip features. Experiments are carried out using HMMs with different numbers of states and Gaussian mixture components on a small database for the speaker-dependent case. The most important conclusion that can be drawn from the recognition results is that, with a 4-state HMM using 16 Gaussian mixture components, the integrated discriminate vector obtained after feature fusion outperforms the geometrical features vector alone, the DCT descriptors vector alone, and the DCT coefficients vector alone. Compared with the method that cascades the geometrical features vector and the DCT descriptors, the method that cascades the geometrical features vector and the DCT coefficients integrates more information from the lip region, and the recognition rate is improved by as much as 3.18% with the best weighting coefficients (m:n = 1.5:1).
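The feature-level fusion with weighting coefficients mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the fusion formula, any normalization step, or which of m and n applies to which feature set, so everything below is an assumption: the function name `fuse_features` is hypothetical, m is assumed to weight the geometrical features and n the DCT coefficients, and the sample values are invented purely for illustration.

```python
from typing import List, Sequence


def fuse_features(
    geometric: Sequence[float],
    pixel: Sequence[float],
    m: float = 1.5,
    n: float = 1.0,
) -> List[float]:
    """Scale each feature vector by its weight, then concatenate (cascade) them.

    Assumption: m weights the geometrical features and n weights the
    pixel-based (DCT) features; the abstract only states the ratio m:n.
    """
    return [m * g for g in geometric] + [n * p for p in pixel]


# Hypothetical per-frame features, not taken from the paper.
geometric = [2.0, 4.0]   # e.g. lip width and height
dct = [1.0, -3.0, 0.5]   # e.g. a few low-frequency DCT coefficients

fused = fuse_features(geometric, dct)
print(fused)
```

In a recognizer of the kind the abstract describes, one such fused vector per video frame would form the observation sequence passed to the HMM.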

Keywords :

Gaussian processes; discrete cosine transforms; feature extraction; hidden Markov models; image fusion; image sequences; speech synthesis; Gaussian mixture; data fusion; discriminate vector; geometrical based lip feature fusion; lipreading; pixel based lip feature fusion; speech synthesis system; speech-impaired people; visual-speech; weighting coefficients; Acoustics; Adaptation model; Computational modeling; Educational institutions

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Computational Intelligence and Natural Computing Proceedings (CINC), 2010 Second International Conference on

Conference_Location :

Wuhan

Print_ISBN :

978-1-4244-7705-0

Type :

conf

DOI :

10.1109/CINC.2010.5643872

Filename :

5643872

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=534937