• DocumentCode
    3352809
  • Title

    Inter-frame contextual modelling for visual speech recognition

  • Author

    Pass, Adrian ; Ming, Ji ; Hanna, Philip ; Zhang, Jianguo ; Stewart, Darryl

  • Author_Institution
    Sch. of Electron., Electr. Eng. & Comput. Sci., Queens Univ., Belfast, UK
  • fYear
    2010
  • fDate
    26-29 Sept. 2010
  • Firstpage
    93
  • Lastpage
    96
  • Abstract
    In this paper, we present a new approach to visual speech recognition which improves contextual modelling by combining Inter-Frame Dependent and Hidden Markov Models. This approach captures contextual information in visual speech that may be lost using a Hidden Markov Model alone. We apply contextual modelling to a large speaker-independent isolated digit recognition task, and compare our approach to two commonly adopted feature-based techniques for incorporating speech dynamics. Results are presented from baseline feature-based systems and the combined modelling technique. We illustrate that both of these techniques achieve similar levels of performance when used independently. However, significant improvements in performance can be achieved through a combination of the two. In particular, we report a relative Word Error Rate improvement in excess of 17% over our best baseline system.
  • Keywords
    hidden Markov models; speech recognition; digit recognition; hidden Markov model; interframe contextual modelling; speech dynamic; visual speech recognition; Context modeling; Feature extraction; Hidden Markov models; Principal component analysis; Speech; Speech recognition; Visualization; AVASR; Contextual modelling; lipreading; speech dynamics
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Image Processing (ICIP), 2010 17th IEEE International Conference on
  • Conference_Location
    Hong Kong
  • ISSN
    1522-4880
  • Print_ISBN
    978-1-4244-7992-4
  • Electronic_ISBN
    1522-4880
  • Type

    conf

  • DOI
    10.1109/ICIP.2010.5652630
  • Filename
    5652630