Title :
Viseme recognition - a comparative study
Author :
Leszczynski, M. ; Skarbek, W.
Author_Institution :
Fac. of Electron. & Inf. Technol., Warsaw Univ. of Technol., Poland
Abstract :
Three classification algorithms for visual mouth appearances (visemes), which correspond to phonemes and their speech contexts, were compared with respect to recognition rate, time complexity, and ROC performance. Two feature extraction procedures were evaluated. The first is based on a normalized triangle mesh (MESH) covering the mouth area and on the color image texture vector indexed by barycentric coordinates. The second applies the DFT to the image rectangle containing the mouth and works with small blocks of DFT coefficients. The classifiers were designed using the PCA approach and an optimized LDA method which uses the two singular subspaces approach. DFT+LDA exhibits a higher recognition rate than the MESH+LDA and MESH+PCA methods: 97.6% versus 94.4% and 90.2%, respectively. It is also much faster than MESH+PCA (5 ms per video frame versus 26 ms on a 3.2 GHz Pentium IV) but slower than MESH+LDA (5 ms versus 1 ms).
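As a rough illustration of the DFT-based feature extraction, the sketch below takes the 2-D DFT of a grayscale mouth rectangle and keeps the magnitudes of one small low-frequency block of coefficients. The abstract does not specify which coefficient blocks are used, how color is handled, or how the features are normalized, so the function name `dft_block_features`, the `size` parameter, the use of magnitudes, and the scaling are all assumptions made for illustration, not the authors' procedure.

```python
import numpy as np


def dft_block_features(mouth, size=4):
    """Return magnitudes of a size x size low-frequency block of the 2-D DFT.

    `mouth` is a 2-D grayscale array holding the mouth rectangle.
    `size` is a hypothetical parameter; the abstract gives no block size.
    """
    mouth = np.asarray(mouth, dtype=float)
    if mouth.ndim != 2:
        raise ValueError("expected a 2-D grayscale image")
    spectrum = np.fft.fft2(mouth)
    # In the unshifted spectrum the lowest non-negative frequencies
    # sit in the top-left corner, with the DC term at [0, 0].
    block = np.abs(spectrum[:size, :size])
    # Assumed normalization: divide by the pixel count so that the
    # features do not scale with the rectangle size.
    return (block / mouth.size).ravel()


# Synthetic stand-in for a mouth rectangle (no real data is used here).
rng = np.random.default_rng(0)
frame = rng.random((32, 48))
features = dft_block_features(frame)
print(features.shape)  # (16,)
```

With this normalization the first feature equals the mean intensity of the rectangle. A feature vector of this kind would then be passed to a linear projection such as PCA or LDA before classification.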
Keywords :
discrete Fourier transforms; feature extraction; gesture recognition; image classification; image colour analysis; image texture; principal component analysis; DFT; PCA approach; barycentric coordinates; classification algorithms; color image texture vector; feature extraction; linear discriminant analysis; viseme recognition; Classification algorithms; Facial animation; Head; High performance computing; Humans; Information technology; Linear discriminant analysis; Mouth; Principal component analysis; Speech recognition;
Conference_Title :
Advanced Video and Signal Based Surveillance, 2005. AVSS 2005. IEEE Conference on
Print_ISBN :
0-7803-9385-6
DOI :
10.1109/AVSS.2005.1577282