Regional Information Center for Science and Technology - Performance Analysis of Lip Synchronization Using LPC, MFCC and PLP Speech Parameters

DocumentCode :

2348774

Title :

Performance Analysis of Lip Synchronization Using LPC, MFCC and PLP Speech Parameters

Author :

Goyani, Mahesh ; Dave, Namrata ; Patel, N.M.

Author_Institution :

Dept. of Comput. Eng., SP Univ., Nagar, India

fYear :

2010

fDate :

26-28 Nov. 2010

Firstpage :

582

Lastpage :

587

Abstract :

Many multimedia applications and entertainment industry products, such as games, cartoons and film dubbing, require speech-driven face animation and audio-video synchronization. An Automatic Speech Recognition (ASR) system alone does not give good results in noisy environments. An Audio Visual Speech Recognition system plays a vital role in such harsh environments because it uses both audio and visual information. In this paper, we propose a novel approach with enhanced performance over the traditional methods reported so far. Our algorithm works on the basis of acoustic and visual parameters to achieve better results. We have tested our system for the English language using LPC, MFCC and PLP parameters of the speech. Lip parameters such as lip width and lip height are extracted from the video, and both the acoustic and visual parameters are used to train systems such as an Artificial Neural Network (ANN), Vector Quantization (VQ), Dynamic Time Warping (DTW) and a Support Vector Machine (SVM). We have employed a neural network in our research work with LPC, MFCC and PLP parameters. Results show that our system responds very well on the tested vowels.
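
The abstract names LPC as one of the acoustic parameter sets but does not say how the coefficients were computed. As an illustration only, the sketch below uses the textbook autocorrelation method with the Levinson-Durbin recursion on a single speech frame. The function names `autocorrelation` and `lpc`, the absence of windowing, and the choice of order are assumptions made for this example and are not taken from the paper.

```python
def autocorrelation(frame, max_lag):
    """Return the autocorrelation values r[0..max_lag] of one frame."""
    n = len(frame)
    return [
        sum(frame[i] * frame[i + lag] for i in range(n - lag))
        for lag in range(max_lag + 1)
    ]


def lpc(frame, order):
    """Estimate LPC coefficients with the Levinson-Durbin recursion.

    Returns (a, err), where a[0] == 1.0 and the linear predictor is
    s[n] ~ -(a[1]*s[n-1] + ... + a[order]*s[n-order]);
    err is the final prediction error energy.
    Hypothetical helper for illustration, not the authors' code.
    """
    r = autocorrelation(frame, order)
    a = [1.0] + [0.0] * order
    err = r[0]
    if err == 0.0:
        # Silent frame: no prediction is possible, return the trivial filter.
        return a, 0.0
    for i in range(1, order + 1):
        # Reflection coefficient for model order i.
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err
        # Update coefficients 1..i-1 from the previous order, then set a[i].
        new_a = a[:]
        for j in range(1, i):
            new_a[j] = a[j] + k * a[i - j]
        new_a[i] = k
        a = new_a
        # Prediction error shrinks by (1 - k^2) at each order.
        err *= 1.0 - k * k
    return a, err
```

In a pipeline like the one the abstract outlines, a vector such as `a[1:]` from each frame could be combined with visual measurements (for example lip width and lip height) to form the input of a classifier. How the features are framed and combined is not specified in this record.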

Keywords :

linear predictive coding; neural nets; speech coding; speech recognition; English language; LPC speech parameters; MFCC speech parameters; PLP speech parameters; acoustic parameters; artificial neural network; audio visual speech recognition system; audio-video synchronization; automatic speech recognition system; cartoons; dynamic time warping; entertainment industry products; film dubbing; games; linear predictive codes; lip height; lip parameters; lip synchronization; lip width; multimedia applications; noisy environment; performance analysis; speech driven face animation; support vector machine; train systems; vector quantization; visual parameters; Automatic Speech Recognition; Neural Network; Phoneme; Speech Parameter; Viseme;

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Computational Intelligence and Communication Networks (CICN), 2010 International Conference on

Conference_Location :

Bhopal

Print_ISBN :

978-1-4244-8653-3

Electronic_ISBN :

978-0-7695-4254-6

Type :

conf

DOI :

10.1109/CICN.2010.115

Filename :

5702038

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=2348774