• DocumentCode
    3324567
  • Title
    An investigation into features for multi-view lipreading
  • Author
    Pass, Adrian; Zhang, Jianguo; Stewart, Darryl
  • Author_Institution
    Sch. of Electron., Queens Univ. Belfast, Belfast, UK
  • fYear
    2010
  • fDate
    26-29 Sept. 2010
  • Firstpage
    2417
  • Lastpage
    2420
  • Abstract
    This paper presents, for the first time, results showing the effect of speaker head pose angle on automatic lip-reading performance over a wide range of closely spaced angles. We analyse the effect head pose has on the features themselves and show that selecting coefficients with minimum variance with respect to pose angle improves recognition performance when training and test pose angles differ. Experiments are conducted using the initial phase of a unique multi-view audio-visual database designed specifically for research and development of pose-invariant lip-reading systems. First, we show that the higher-order horizontal spatial frequency components become the most detrimental as the pose deviates. Second, we assess the performance of different feature selection masks across a range of pose angles, including a new mask based on Minimum Cross-Pose Variance coefficients. We report a relative improvement of 50% in Word Error Rate when using our selection mask rather than a common energy-based selection during profile-view lip-reading.
  • Keywords
    feature extraction; pose estimation; automatic lip-reading performance; common energy based selection; feature selection masks; higher order horizontal spatial frequency components; minimum cross-pose variance coefficients; multiview audio-visual database; multiview lipreading; speaker head pose angle effect; train-test pose angles; word error rate; Databases; Discrete cosine transforms; Mouth; Speech recognition; Visualization; AVASR; discrete cosine transform; pose invariance
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Image Processing (ICIP), 2010 17th IEEE International Conference on
  • Conference_Location
    Hong Kong
  • ISSN
    1522-4880
  • Print_ISBN
    978-1-4244-7992-4
  • Electronic_ISBN
    1522-4880
  • Type
    conf
  • DOI
    10.1109/ICIP.2010.5650963
  • Filename
    5650963
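
The abstract contrasts an energy-based coefficient selection with a mask built from Minimum Cross-Pose Variance coefficients, but it does not say how either mask is computed. The following is only a minimal sketch of one plausible reading, not the authors' method. It assumes that each mouth image has already been reduced to a flat vector of transform coefficients, that "cross-pose variance" means the variance of a coefficient's per-pose mean across pose angles, and that "energy" means the mean squared coefficient value. The function names and the `features_by_pose` layout are hypothetical.

```python
from statistics import fmean, pvariance


def per_pose_means(features_by_pose):
    """Return the mean coefficient vector for each pose angle.

    features_by_pose maps a pose angle to a list of equal-length
    coefficient vectors (hypothetical layout, not from the paper).
    """
    means = {}
    for pose, vectors in features_by_pose.items():
        n_coeffs = len(vectors[0])
        means[pose] = [fmean(v[i] for v in vectors) for i in range(n_coeffs)]
    return means


def min_cross_pose_variance_mask(features_by_pose, k):
    """Pick the k coefficient indices whose per-pose means vary least.

    Assumption: cross-pose variance is the population variance of the
    per-pose means of each coefficient.
    """
    means = per_pose_means(features_by_pose)
    n_coeffs = len(next(iter(means.values())))
    variance = [
        pvariance([m[i] for m in means.values()]) for i in range(n_coeffs)
    ]
    ranked = sorted(range(n_coeffs), key=lambda i: variance[i])
    return sorted(ranked[:k])


def energy_mask(features_by_pose, k):
    """Pick the k coefficient indices with the highest mean squared value.

    Assumption: this stands in for the 'common energy based selection'
    mentioned in the abstract.
    """
    all_vectors = [v for vectors in features_by_pose.values() for v in vectors]
    n_coeffs = len(all_vectors[0])
    energy = [fmean(v[i] ** 2 for v in all_vectors) for i in range(n_coeffs)]
    ranked = sorted(range(n_coeffs), key=lambda i: energy[i], reverse=True)
    return sorted(ranked[:k])
```

Under these assumptions, a coefficient with high energy but a mean that shifts strongly between frontal and profile views would be kept by `energy_mask` and dropped by `min_cross_pose_variance_mask`, which is the kind of difference the abstract attributes to the two selection schemes.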