• DocumentCode
    2300278
  • Title

    Hidden Conditional Random Fields for Visual Speech Recognition

  • Author

    Pass, Adrian ; Zhang, Jianguo ; Stewart, Darryl

  • Author_Institution
    Sch. of Electron., Electr. Eng. & Comput. Sci., Queen's Univ. Belfast, Belfast, UK
  • fYear
    2009
  • fDate
    2-4 Sept. 2009
  • Firstpage
    117
  • Lastpage
    122
  • Abstract
    In this paper we present the application of hidden conditional random fields (HCRFs) to modeling speech for visual speech recognition. HCRFs may be easily adapted to model long-range dependencies across an observation sequence. As a result, visual word recognition performance can be improved, as the model is able to take a more contextual approach to generating state sequences. Results are presented from a speaker-dependent, isolated-digit visual speech recognition task, with comparisons against a baseline HMM system. First, we illustrate that word recognition rates on clean video using HCRFs can be improved by increasing the number of past and future observations taken into account by each state. Second, we compare model performance under various levels of video compression on the test set. As far as we are aware, this is the first attempted use of HCRFs for visual speech recognition.
  • Keywords
    data compression; speech recognition; video coding; baseline HMM system; contextual approach; hidden conditional random fields; speaker-dependent speech recognition; state sequences; video compression; visual speech recognition; visual word recognition performance; Application software; Computer science; Context modeling; Exponential distribution; Hidden Markov models; Image processing; Machine vision; Mouth; Speech recognition; Video compression
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Machine Vision and Image Processing Conference, 2009. IMVIP '09. 13th International
  • Conference_Location
    Dublin
  • Print_ISBN
    978-1-4244-4875-3
  • Electronic_ISBN
    978-0-7695-3796-2
  • Type

    conf

  • DOI
    10.1109/IMVIP.2009.28
  • Filename
    5319309