DocumentCode :
2348774
Title :
Performance Analysis of Lip Synchronization Using LPC, MFCC and PLP Speech Parameters
Author :
Goyani, Mahesh ; Dave, Namrata ; Patel, N.M.
Author_Institution :
Dept. of Comput. Eng., SP Univ., Nagar, India
fYear :
2010
fDate :
26-28 Nov. 2010
Firstpage :
582
Lastpage :
587
Abstract :
Many multimedia applications and entertainment industry products, such as games, cartoons and film dubbing, require speech-driven face animation and audio-video synchronization. An Automatic Speech Recognition (ASR) system alone does not give good results in noisy environments. An Audio Visual Speech Recognition system plays a vital role in such harsh environments because it uses both audio and visual information. In this paper, we propose a novel approach with enhanced performance over previously reported traditional methods. Our algorithm works on the basis of acoustic and visual parameters to achieve better results. We have tested our system on the English language using LPC, MFCC and PLP parameters of the speech. Lip parameters such as lip width and lip height are extracted from the video, and both the acoustic and the visual parameters are used to train systems such as Artificial Neural Network (ANN), Vector Quantization (VQ), Dynamic Time Warping (DTW) and Support Vector Machine (SVM). In our research work, we have employed a neural network with LPC, MFCC and PLP parameters. Results show that our system gives very good results for the tested vowels.
Keywords :
linear predictive coding; neural nets; speech coding; speech recognition; English language; LPC speech parameters; MFCC speech parameters; PLP speech parameters; acoustic parameters; artificial neural network; audio visual speech recognition system; audio-video synchronization; automatic speech recognition system; cartoons; dynamic time warping; entertainment industry products; film dubbing; games; linear predictive codes; lip height; lip parameters; lip synchronization; lip width; multimedia applications; noisy environment; performance analysis; speech driven face animation; support vector machine; train systems; vector quantization; visual parameters; Automatic Speech Recognition; Neural Network; Phoneme; Speech Parameter; Viseme;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Computational Intelligence and Communication Networks (CICN), 2010 International Conference on
Conference_Location :
Bhopal
Print_ISBN :
978-1-4244-8653-3
Electronic_ISBN :
978-0-7695-4254-6
Type :
conf
DOI :
10.1109/CICN.2010.115
Filename :
5702038