• DocumentCode
    758707
  • Title

    Discriminative Analysis of Lip Motion Features for Speaker Identification and Speech-Reading

  • Author

    Cetingul, H.E. ; Yemez, Y. ; Erzin, E. ; Tekalp, A.M.

  • Author_Institution
    Coll. of Eng., Koc Univ.
  • Volume
    15
  • Issue
    10
  • fYear
    2006
  • Firstpage
    2879
  • Lastpage
    2891
  • Abstract
    There have been several studies that jointly use audio, lip intensity, and lip geometry information for speaker identification and speech-reading applications. This paper proposes using explicit lip motion information, instead of or in addition to lip intensity and/or geometry information, for speaker identification and speech-reading within a unified feature selection and discrimination analysis framework, and addresses two important issues: 1) is using explicit lip motion information useful, and 2) if so, what are the best lip motion features for these two applications? The best lip motion features for speaker identification are considered to be those that result in the highest discrimination of individual speakers in a population, whereas for speech-reading, the best features are those providing the highest phoneme/word/phrase recognition rate. Several lip motion feature candidates have been considered, including dense motion features within a bounding box about the lip, lip contour motion features, and combinations of these with lip shape features. Furthermore, a novel two-stage, spatial, and temporal discrimination analysis is introduced to select the best lip motion features for speaker identification and speech-reading applications. Experimental results using a hidden-Markov-model-based recognition system indicate that using explicit lip motion information provides additional performance gains in both applications, and that lip motion features prove more valuable in the speech-reading application.
  • Keywords
    face recognition; gesture recognition; hidden Markov models; speaker recognition; discriminative analysis; explicit lip contour motion features; hidden-Markov-model-based recognition system; lip shape features; phoneme-word-phrase recognition rate; speaker identification; speech-reading application; unified feature selection; Discrete cosine transforms; Information geometry; Linear discriminant analysis; Motion analysis; Principal component analysis; Shape; Speech analysis; Speech recognition; Testing; Vectors; Bayesian discriminative feature selection; lip motion; temporal discriminative feature selection; Algorithms; Artificial Intelligence; Biometry; Discriminant Analysis; Humans; Image Enhancement; Image Interpretation, Computer-Assisted; Information Storage and Retrieval; Lip; Lipreading; Movement; Pattern Recognition, Automated; Speech; Speech Recognition Software
  • fLanguage
    English
  • Journal_Title
    Image Processing, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    1057-7149
  • Type

    jour

  • DOI
    10.1109/TIP.2006.877528
  • Filename
    1703580