Title :
DTCWT-based dynamic texture features for visual speech recognition
Author :
Feng, Xiaohui ; Wang, Weining
Author_Institution :
Sch. of Electron. & Inf. Eng., South China Univ. of Technol., Guangzhou
Date :
Nov. 30 2008-Dec. 3 2008
Abstract :
In this paper, new visual dynamic texture features based on the dual-tree complex wavelet transform (DTCWT) are proposed to capture important lip motion information for speech recognition. Lip texture features are extracted with the DTCWT, chosen for its approximate shift invariance and good directional selectivity. Canberra distances between the lip texture features of adjacent frames are used as visual dynamic features. Experimental evaluations on a Chinese corpus demonstrate that the visual dynamic texture features outperform static features in visual speech recognition. Compared to commonly used Canberra distances features, the dynamic texture features lead to about 8% improvement in word accuracy.
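The dynamic-feature step described in the abstract can be illustrated with a minimal sketch. It assumes each frame has already been reduced to a real-valued texture feature vector (for example, magnitudes of DTCWT coefficients); the DTCWT extraction itself is not shown. The function names `canberra_distance` and `dynamic_features` are hypothetical, and producing one scalar distance per pair of adjacent frames is an assumption, since the abstract does not say whether distances are computed per subband or over the whole vector.

```python
from typing import List, Sequence


def canberra_distance(u: Sequence[float], v: Sequence[float]) -> float:
    """Canberra distance: sum_i |u_i - v_i| / (|u_i| + |v_i|).

    Terms where both components are zero are treated as contributing 0,
    a common convention that avoids a 0/0 division.
    """
    if len(u) != len(v):
        raise ValueError("feature vectors must have the same length")
    total = 0.0
    for a, b in zip(u, v):
        denom = abs(a) + abs(b)
        if denom > 0:
            total += abs(a - b) / denom
    return total


def dynamic_features(frames: Sequence[Sequence[float]]) -> List[float]:
    """Distance between each frame's feature vector and the previous one.

    Assumption: `frames` holds one precomputed texture feature vector per
    video frame. The result has len(frames) - 1 entries.
    """
    return [
        canberra_distance(frames[t - 1], frames[t])
        for t in range(1, len(frames))
    ]
```

Because every term is normalised by the magnitudes of the two components, each term lies between 0 and 1, so small coefficients contribute on the same scale as large ones.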
Keywords :
feature extraction; image texture; speech recognition; trees (mathematics); wavelet transforms; Canberra distances; Chinese corpus; dual tree complex wavelet transform; dynamic texture features; lip motion information; shift invariance; visual speech recognition; Acoustic noise; Automatic speech recognition; Data mining; Discrete wavelet transforms; Filters; Man machine systems; Video sequences
Conference_Title :
Circuits and Systems, 2008. APCCAS 2008. IEEE Asia Pacific Conference on
Conference_Location :
Macao
Print_ISBN :
978-1-4244-2341-5
Electronic_ISBN :
978-1-4244-2342-2
DOI :
10.1109/APCCAS.2008.4746069