• DocumentCode
    2324609
  • Title
    DTCWT-based dynamic texture features for visual speech recognition
  • Author
    Feng, Xiaohui; Wang, Weining
  • Author_Institution
    Sch. of Electron. & Inf. Eng., South China Univ. of Technol., Guangzhou
  • fYear
    2008
  • fDate
    Nov. 30 2008-Dec. 3 2008
  • Firstpage
    497
  • Lastpage
    500
  • Abstract
    In this paper, new visual dynamic texture features based on the dual-tree complex wavelet transform (DTCWT) are proposed to capture important lip motion information for speech recognition. Lip texture features are extracted with the DTCWT, chosen for its approximate shift invariance and good directional selectivity. Canberra distances between the lip texture features of adjacent frames are used as visual dynamic features. Experimental evaluations on a Chinese corpus demonstrate that the performance of the visual dynamic texture features is superior to that of static features in visual speech recognition. Compared to commonly used Canberra distance features, the dynamic texture features lead to about an 8% improvement in word accuracy.
  • Keywords
    feature extraction; image texture; speech recognition; trees (mathematics); wavelet transforms; Canberra distances; Chinese corpus; dual tree complex wavelet transform; dynamic texture features; feature extraction; lip motion information; shift invariance; visual speech recognition; Acoustic noise; Automatic speech recognition; Data mining; Discrete wavelet transforms; Feature extraction; Filters; Man machine systems; Speech recognition; Video sequences; Wavelet transforms;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Circuits and Systems, 2008. APCCAS 2008. IEEE Asia Pacific Conference on
  • Conference_Location
    Macao
  • Print_ISBN
    978-1-4244-2341-5
  • Electronic_ISBN
    978-1-4244-2342-2
  • Type
    conf
  • DOI
    10.1109/APCCAS.2008.4746069
  • Filename
    4746069
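
The abstract describes using Canberra distances between the lip texture features of adjacent frames as dynamic features. A minimal sketch of that step follows, assuming each frame has already been reduced to a fixed-length vector of texture features (for example, statistics of DTCWT subband magnitudes). The function names, the single distance per frame pair, and the sample values are illustrative assumptions, not details taken from the paper.

```python
def canberra(x, y):
    """Canberra distance: sum over i of |x_i - y_i| / (|x_i| + |y_i|)."""
    if len(x) != len(y):
        raise ValueError("feature vectors must have the same length")
    total = 0.0
    for a, b in zip(x, y):
        denom = abs(a) + abs(b)
        if denom > 0:  # terms of the form 0/0 are conventionally treated as 0
            total += abs(a - b) / denom
    return total


def dynamic_features(frames):
    """Return one Canberra distance per pair of adjacent frames (assumed layout)."""
    return [canberra(prev, cur) for prev, cur in zip(frames, frames[1:])]


# Hypothetical per-frame texture features (placeholder values, not from the paper).
frames = [
    [1.0, 2.0, 0.0],
    [1.0, 2.0, 0.0],
    [3.0, 2.0, 0.0],
]
print(dynamic_features(frames))  # [0.0, 0.5]
```

Because each term is normalized by the magnitudes of the two values being compared, small coefficients contribute as much as large ones, which may be one reason this distance suits wavelet features whose scales differ across subbands.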