DocumentCode :
590300
Title :
Geometry based lip reading system using Multi Dimension Dynamic Time Warping
Author :
Ibrahim, M.Z. ; Mulvaney, D.J.
Author_Institution :
Sch. of Electron., Electr. & Syst. Eng., Loughborough Univ., Loughborough, UK
fYear :
2012
fDate :
27-30 Nov. 2012
Firstpage :
1
Lastpage :
6
Abstract :
This paper describes an automatic lip reading system consisting of two main modules: 1) a pre-processing module that extracts lip geometry information from the video sequence and 2) a classification module that identifies the visual speech from dynamic lip movements. The system was assessed on recognition of the English digits 0 to 9 as spoken by the speakers in the video sequences of the CUAVE database. Lip geometry features were extracted using a combination of a skin color filter, a border following algorithm and a convex hull approach. The proposed method was compared with the popular 'snake' technique and was found to improve lip shape extraction performance for the database studied. Lip geometry features including height, width, ratio, area, perimeter and various combinations of these were evaluated to determine which best represents speech in the visual domain under three separate classification methods, namely optical flow, Dynamic Time Warping (DTW) and a new approach termed Multi-Dimensional DTW. Experiments show that the proposed system achieves a recognition performance of 68% using only lip height, lip width and the ratio of these features, demonstrating that the system has the potential to be incorporated in a multimodal speech recognition system for use in noisy environments.
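The abstract does not give the formulation of Multi-Dimensional DTW, so the following is only a minimal sketch of one common reading: standard DTW in which each frame is a feature vector (here assumed to be lip height, width and their ratio) and the local cost is the Euclidean distance between vectors. The function names `md_dtw_distance` and `classify`, the nearest-template decision rule and the sample feature values are all illustrative assumptions, not the authors' implementation.

```python
import math


def md_dtw_distance(seq_a, seq_b):
    """Accumulated DTW cost between two sequences of feature vectors.

    Each element is one frame, e.g. (height, width, ratio). The local cost
    is the Euclidean distance between frames (an assumption; the paper's
    actual local cost is not stated in the abstract).
    """
    n, m = len(seq_a), len(seq_b)
    inf = float("inf")
    # cost[i][j]: minimal accumulated cost of aligning seq_a[:i] with seq_b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = math.dist(seq_a[i - 1], seq_b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # advance in seq_a only
                cost[i][j - 1],      # advance in seq_b only
                cost[i - 1][j - 1],  # advance in both
            )
    return cost[n][m]


def classify(query, templates):
    """Return the label of the template closest to the query (hypothetical rule)."""
    return min(templates, key=lambda label: md_dtw_distance(query, templates[label]))


# Hypothetical per-frame features: (lip height, lip width, height/width ratio)
templates = {
    "one": [(1.0, 2.0, 0.5), (2.0, 3.0, 0.67)],
    "two": [(5.0, 5.0, 1.0), (6.0, 5.0, 1.2)],
}
# A slower utterance of "one": the first frame is repeated
query = [(1.0, 2.0, 0.5), (1.0, 2.0, 0.5), (2.0, 3.0, 0.67)]
label = classify(query, templates)
```

Because the warping path may repeat frames, the slowed-down query aligns with the "one" template at zero cost, which is the property that makes DTW suitable for utterances spoken at different speeds.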
Keywords :
convex programming; feature extraction; filtering theory; image classification; image colour analysis; image sequences; speaker recognition; CUAVE database; automatic lip reading system; border following algorithm; classification module; convex hull approach; dynamic lip movements; geometry based lip reading system; lip geometry feature extraction; lip geometry information extraction; lip shape extraction performance; multidimension dynamic time warping; multidimensional DTW; multimodal speech recognition system; optical flow; skin color filter; snake technique; video sequence; visual speech identification; Databases; Feature extraction; Geometry; Image color analysis; Mouth; Shape; Skin; Lip reading; border following; convex hull; lip shape; multi-dimensional dynamic time warping; skin color filter;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Visual Communications and Image Processing (VCIP), 2012 IEEE
Conference_Location :
San Diego, CA
Print_ISBN :
978-1-4673-4405-0
Electronic_ISBN :
978-1-4673-4406-7
Type :
conf
DOI :
10.1109/VCIP.2012.6410805
Filename :
6410805