• DocumentCode
    1623304
  • Title

    Rate-invariant comparisons of covariance paths for visual speech recognition

  • Author

    Jingyong Su ; Srivastava, Anurag ; Souza, Francisco ; Sarkar, Santonu

  • Author_Institution
    Dept. of Math. & Stat., Texas Tech Univ., Lubbock, TX, USA
  • fYear
    2013
  • Firstpage
    1
  • Lastpage
    4
  • Abstract
    An important problem in speech recognition, and in activity recognition more generally, is to develop analyses that are invariant to execution rate. We introduce a theoretical framework that provides a parametrization-invariant metric for comparing parametrized paths on Riemannian manifolds. Treating instances of activities as parametrized paths on a Riemannian manifold of covariance matrices, we apply this framework to the problem of visual speech recognition from image sequences. We represent each sequence as a path on the space of covariance matrices, with each covariance matrix capturing the spatial variability of visual features in a frame, and perform simultaneous pairwise temporal alignment and comparison of paths. This removes the temporal variability and helps provide a robust metric for visual speech classification. We evaluated this idea on the OuluVS database, where temporal alignment improved the rank-1 nearest neighbor classification rate from 32% to 57%.
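    The pipeline described in the abstract (one covariance matrix per frame, then a rate-invariant comparison of the resulting paths) can be illustrated with a rough sketch. This is not the paper's framework: it substitutes a log-Euclidean distance between covariance matrices and plain dynamic time warping for the parametrization-invariant Riemannian metric, and every name in it (`frame_covariance`, `spd_log`, `log_euclidean_dist`, `dtw_distance`) as well as the synthetic feature data is a hypothetical choice made for illustration.

```python
import numpy as np


def frame_covariance(features, eps=1e-6):
    """Covariance of the feature vectors observed in one frame.

    features: array of shape (n_points, d). Returns a (d, d) matrix,
    lightly regularised so it stays symmetric positive definite (SPD).
    """
    c = np.atleast_2d(np.cov(features, rowvar=False))
    return c + eps * np.eye(c.shape[0])


def spd_log(c):
    """Matrix logarithm of an SPD matrix via its eigendecomposition."""
    w, v = np.linalg.eigh(c)
    return (v * np.log(w)) @ v.T


def log_euclidean_dist(a, b):
    """Log-Euclidean distance (an assumed stand-in for the paper's metric)."""
    return np.linalg.norm(spd_log(a) - spd_log(b), ord="fro")


def dtw_distance(path_a, path_b):
    """Dynamic time warping cost between two sequences of SPD matrices.

    DTW is used here only as a simple temporal alignment; it is not the
    alignment procedure of the paper.
    """
    n, m = len(path_a), len(path_b)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = log_euclidean_dist(path_a[i - 1], path_b[j - 1])
            cost[i, j] = d + min(cost[i - 1, j], cost[i, j - 1], cost[i - 1, j - 1])
    return cost[n, m]


# Synthetic example: six frames of random 3-D features with growing spread.
rng = np.random.default_rng(0)
path = [frame_covariance(rng.normal(size=(50, 3)) * (1 + 0.2 * t)) for t in range(6)]

# The same path "executed" at half speed: every frame appears twice.
slow = [c for c in path for _ in range(2)]

print("aligned cost, path vs. slowed copy:", dtw_distance(path, slow))
print("aligned cost, path vs. reversed path:", dtw_distance(path, path[::-1]))
```

    In this toy setting the slowed copy differs from the original only in execution rate, so an alignment-based comparison treats the two as the same path, while the reversed path remains at a positive distance.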
  • Keywords
    covariance matrices; image sequences; pattern classification; speech recognition; Riemannian manifolds; covariance matrix; covariance paths; nearest neighbor classification; rate invariant comparisons; robust metric; spatial variability; temporal alignment; visual speech classification; visual speech recognition; Databases; Lips; Measurement; Registers; Speech; Tongue
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Computer Vision, Pattern Recognition, Image Processing and Graphics (NCVPRIPG), 2013 Fourth National Conference on
  • Conference_Location
    Jodhpur
  • Print_ISBN
    978-1-4799-1586-6
  • Type

    conf

  • DOI
    10.1109/NCVPRIPG.2013.6776200
  • Filename
    6776200