• DocumentCode
    1672044
  • Title

    Improved ROI and within frame discriminant features for lipreading

  • Author

    Potamianos, Gerasimos ; Neti, Chalapathy

  • Author_Institution
    IBM Thomas J. Watson Res. Center, Yorktown Heights, NY, USA
  • Volume
    3
  • fYear
    2001
  • fDate
    2001
  • Firstpage
    250
  • Abstract
    We study three aspects of designing appearance based visual features for automatic lipreading: (a) the choice of the video region of interest (ROI) on which image transform features are obtained; (b) the extraction of speech discriminant features at each frame; (c) the use of temporal information to improve visual speech modeling. With respect to (a), we propose a ROI that includes the speaker's jaw and cheeks, in addition to the traditionally used mouth/lip region. With respect to (b) and (c), we propose the use of a two-stage linear discriminant analysis, both within a single frame and across a large number of frames. On a large-vocabulary, continuous-speech, audio-visual database, the proposed visual features result in a 13% absolute reduction in visual-only word error rate over a baseline visual front end, and in an additional 28% relative improvement in audio-visual over audio-only phonetic classification accuracy.
  • Keywords
    discrete cosine transforms; feature extraction; image recognition; image sequences; speech recognition; audio-visual database; automatic lipreading; automatic speech recognition; continuous speech database; discrete cosine transform; discriminant features; large vocabulary database; linear discriminant analysis; speech discriminant feature extraction; temporal information; video region of interest; visual speech modeling; Algorithm design and analysis; Automatic speech recognition; Discrete cosine transforms; Discrete wavelet transforms; Feature extraction; Linear discriminant analysis; Mouth; Shape; Speech recognition; Vocabulary;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Image Processing, 2001. Proceedings. 2001 International Conference on
  • Conference_Location
    Thessaloniki
  • Print_ISBN
    0-7803-6725-1
  • Type

    conf

  • DOI
    10.1109/ICIP.2001.958098
  • Filename
    958098