Regional Information Center for Science and Technology (RICeST) - Modular BDPCA based visual feature representation for lip-reading

DocumentCode :

1867458

Title :

Modular BDPCA based visual feature representation for lip-reading

Author :

Wu, Guanyong ; Zhu, Jie

Author_Institution :

Dept. of Electron. Eng., Shanghai Jiaotong Univ., Shanghai

fYear :

2008

fDate :

12-15 Oct. 2008

Firstpage :

1328

Lastpage :

1331

Abstract :

Most appearance-based visual feature extraction methods in lip-reading systems treat the mouth image as a whole. However, the visual process of speech is three-dimensional, and treating the mouth image as a whole may lose speech information. Motivated by bidirectional PCA (BDPCA) and the decomposition methods used in the face recognition domain, this paper presents a modular bidirectional PCA (MBDPCA) based visual feature extraction method. In this method, the original mouth image sequences are divided into smaller sub-images, and two approaches to building the covariance matrix are compared: one uses all the sub-image sets together to build a global covariance matrix; the other uses the different sub-image sets independently to build local covariance matrices. BDPCA is then applied to each sub-image set. Experimental results show that the MBDPCA method performs better than both conventional PCA and BDPCA; moreover, further experimental results demonstrate that our lip-reading system is significantly more robust in noisy environments than audio-only speech recognition.
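
The pipeline described in the abstract (split each mouth image into sub-images, build either a global or local covariance matrices, then apply BDPCA to each sub-image set) might be sketched roughly as follows in Python with NumPy. This is an illustrative sketch only, not the authors' implementation: all function names and parameters (`split_blocks`, `fit_bdpca`, `mbdpca_features`, `grid`, `k_col`, `k_row`, `mode`) are hypothetical, and the mean-centering, the scatter normalization, and the concatenation of per-block features into one vector are assumptions that the abstract does not specify.

```python
import numpy as np


def split_blocks(images, grid):
    """Split (N, H, W) images into a (rows*cols, N, h, w) stack of sub-images."""
    _, height, width = images.shape
    rows, cols = grid
    h, w = height // rows, width // cols
    blocks = [
        images[:, r * h:(r + 1) * h, c * w:(c + 1) * w]
        for r in range(rows)
        for c in range(cols)
    ]
    return np.stack(blocks)


def top_eigvecs(scatter, k):
    """Return the k eigenvectors of a symmetric matrix with the largest eigenvalues."""
    _, vecs = np.linalg.eigh(scatter)  # eigenvalues come back in ascending order
    return vecs[:, ::-1][:, :k]


def fit_bdpca(samples, k_col, k_row):
    """Fit BDPCA on (M, h, w) samples; returns (mean, W_col, W_row)."""
    mean = samples.mean(axis=0)
    centered = samples - mean  # assumption: centre on the mean sub-image
    # Row-direction scatter, sum of C^T C (w x w), and
    # column-direction scatter, sum of C C^T (h x h).
    s_row = np.einsum("nij,nik->jk", centered, centered) / len(samples)
    s_col = np.einsum("nji,nki->jk", centered, centered) / len(samples)
    return mean, top_eigvecs(s_col, k_col), top_eigvecs(s_row, k_row)


def project(samples, model):
    """Compute Y = W_col^T (X - mean) W_row for every sample."""
    mean, w_col, w_row = model
    return w_col.T @ (samples - mean) @ w_row


def mbdpca_features(images, grid=(2, 2), k_col=3, k_row=3, mode="local"):
    """Return one feature vector per image, shape (N, blocks * k_col * k_row)."""
    blocks = split_blocks(images, grid)  # (B, N, h, w)
    if mode == "global":
        # One covariance pair estimated from all sub-image sets pooled together.
        pooled = blocks.reshape(-1, *blocks.shape[2:])
        models = [fit_bdpca(pooled, k_col, k_row)] * len(blocks)
    else:
        # A separate covariance pair for each sub-image position.
        models = [fit_bdpca(b, k_col, k_row) for b in blocks]
    feats = [
        project(b, m).reshape(len(images), -1) for b, m in zip(blocks, models)
    ]
    # Assumption: per-block features are simply concatenated.
    return np.concatenate(feats, axis=1)
```

Calling `mbdpca_features(frames, mode="global")` and `mbdpca_features(frames, mode="local")` on the same `(N, H, W)` array of mouth-region frames gives the two variants that the abstract compares; which one works better on real data is a question the sketch itself does not answer.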

Keywords :

covariance matrices; face recognition; feature extraction; image sequences; principal component analysis; audio-only speech recognition; audio-visual speech recognition; decomposition methods; face recognition domain; global covariance matrix; lip-reading system; local covariance matrices; modular BDPCA based visual feature representation; original mouth image sequences; speech information; speech process; visual feature extraction methods; Covariance matrix; Face recognition; Feature extraction; Image sequences; Mouth; Principal component analysis; Robustness; Speech processing; Speech recognition; Working environment noise; audio-visual speech recognition; feature extraction; lip-reading;

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Image Processing, 2008. ICIP 2008. 15th IEEE International Conference on

Conference_Location :

San Diego, CA

ISSN :

1522-4880

Print_ISBN :

978-1-4244-1765-0

Electronic_ISBN :

1522-4880

Type :

conf

DOI :

10.1109/ICIP.2008.4712008

Filename :

4712008

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=1867458