DocumentCode
2324609
Title
DTCWT-based dynamic texture features for visual speech recognition
Author
Feng, Xiaohui ; Wang, Weining
Author_Institution
Sch. of Electron. & Inf. Eng., South China Univ. of Technol., Guangzhou
fYear
2008
fDate
Nov. 30 2008-Dec. 3 2008
Firstpage
497
Lastpage
500
Abstract
In this paper, new visual dynamic texture features based on the dual-tree complex wavelet transform (DTCWT) are proposed to capture important lip motion information for speech recognition. Lip texture features are extracted with the DTCWT because of its approximate shift invariance and good directional selectivity. Canberra distances between lip texture features in adjacent frames are used as visual dynamic features. Experimental evaluations on a Chinese corpus demonstrate that visual dynamic texture features outperform static features in visual speech recognition. Compared to commonly used Canberra distance features, the dynamic texture features lead to about an 8% improvement in word accuracy.
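As an illustration of the dynamic-feature step described in the abstract, the following minimal sketch computes the Canberra distance between feature vectors of adjacent frames. It is not the authors' implementation: the function names `canberra` and `dynamic_features` are hypothetical, the feature rows are small placeholder values standing in for DTCWT lip texture features, and it assumes one scalar distance per pair of adjacent frames, with terms whose denominator is zero counted as 0.

```python
import numpy as np


def canberra(u, v):
    """Canberra distance: sum_i |u_i - v_i| / (|u_i| + |v_i|)."""
    u = np.asarray(u, dtype=float)
    v = np.asarray(v, dtype=float)
    num = np.abs(u - v)
    den = np.abs(u) + np.abs(v)
    # Assumption: terms where both entries are zero contribute 0.
    terms = np.divide(num, den, out=np.zeros_like(num), where=den != 0)
    return float(terms.sum())


def dynamic_features(frames):
    """One Canberra distance per pair of adjacent frames (rows)."""
    frames = np.asarray(frames, dtype=float)
    return np.array(
        [canberra(frames[t], frames[t + 1]) for t in range(len(frames) - 1)]
    )


# Placeholder per-frame feature vectors (NOT real DTCWT coefficients).
feats = np.array(
    [
        [1.0, 2.0, 0.0],
        [1.0, 4.0, 0.0],
        [3.0, 4.0, 2.0],
    ]
)
dyn = dynamic_features(feats)
print(dyn)
```

A sequence of T frames yields T - 1 dynamic values under this assumption; how the paper combines them with static features is not stated in the abstract.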
Keywords
feature extraction; image texture; speech recognition; trees (mathematics); wavelet transforms; Canberra distances; Chinese corpus; dual tree complex wavelet transform; dynamic texture features; feature extraction; lip motion information; shift invariance; visual speech recognition; Acoustic noise; Automatic speech recognition; Data mining; Discrete wavelet transforms; Feature extraction; Filters; Man machine systems; Speech recognition; Video sequences; Wavelet transforms;
fLanguage
English
Publisher
IEEE
Conference_Titel
Circuits and Systems, 2008. APCCAS 2008. IEEE Asia Pacific Conference on
Conference_Location
Macao
Print_ISBN
978-1-4244-2341-5
Electronic_ISBN
978-1-4244-2342-2
Type
conf
DOI
10.1109/APCCAS.2008.4746069
Filename
4746069