Title :
Lip feature fusion in speech synthesis system driven by visual-speech for speech impaired
Author :
Wang, Mengjun ; Li, Gang
Author_Institution :
Sch. of Inf. Eng., HeBei Univ. of Technol., Tianjin, China
Abstract :
Lipreading is applied to synthesize speech for speech-impaired people. To improve recognition accuracy, concatenative feature fusion with weighting coefficients integrates the geometrical feature vector of the lip region with the Discrete Cosine Transform (DCT) descriptors of the lip contours, yielding a new discriminative vector. Experiments are carried out with HMMs using different numbers of states and Gaussian mixture components on a small database for the speaker-dependent case. The results show that the fused discriminative vector retains information from both the geometrical feature vector of the lip region and the DCT descriptors of the lip contours. With the best weighting coefficients, m:n = 1.5:1, the recognition rate is improved by as much as 3.92% and 29.91%.
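The weighted concatenative fusion described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `fuse_lip_features`, the assignment of m to the geometrical features and n to the DCT descriptors, and the feature dimensions are all assumptions made for illustration. The abstract states only that the two vectors are concatenated with weighting coefficients and that m:n = 1.5:1 performed best.

```python
import numpy as np


def fuse_lip_features(geometric, dct_descriptors, m=1.5, n=1.0):
    """Weighted concatenation of two per-frame lip feature vectors.

    Assumption: m scales the geometrical features and n scales the DCT
    contour descriptors; the abstract does not say which weight goes
    with which feature set.
    """
    geometric = np.asarray(geometric, dtype=float)
    dct_descriptors = np.asarray(dct_descriptors, dtype=float)
    # Concatenate along the feature axis, so the same code handles a single
    # frame (1-D input) or a sequence of frames (2-D input, one row per frame).
    return np.concatenate([m * geometric, n * dct_descriptors], axis=-1)


# Hypothetical example: 2 geometrical values and 3 DCT coefficients per frame.
geometric_frame = np.array([1.0, 2.0])
dct_frame = np.array([3.0, 4.0, 5.0])
fused_frame = fuse_lip_features(geometric_frame, dct_frame)
```

Under these assumptions, the fused vector has length equal to the sum of the two input lengths, and a sequence of such vectors would serve as the observation sequence for the HMM recognizer.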
Keywords :
discrete cosine transforms; feature extraction; hidden Markov models; sensor fusion; speech recognition; speech synthesis; DCT descriptors; Gaussian mixture component; HMM; concatenative feature fusion; discrete cosine transform; geometrical feature vector; lip contours; lip feature fusion; lip reading; lip region; speaker-dependent case; speech impaired person; speech synthesis system; speech-impaired people; visual speech; weighting coefficients; Image sequences; Speech; Visualization; Hidden Markov Model; feature fusion; weighting combination
Conference_Title :
Biomedical Engineering and Informatics (BMEI), 2010 3rd International Conference on
Conference_Location :
Yantai
Print_ISBN :
978-1-4244-6495-1
DOI :
10.1109/BMEI.2010.5639968