• DocumentCode
    3489890
  • Title
    Joint audio-video processing for biometric speaker identification
  • Author
    Kanak, A. ; Erzin, E. ; Yemez, Y. ; Tekalp, A. Murat
  • Author_Institution
    Multimedia, Vision & Graphics Lab., Koc Univ., Istanbul, Turkey
  • Volume
    2
  • fYear
    2003
  • fDate
    6-10 April 2003
  • Abstract
    We present a bimodal audio-visual speaker identification system. The objective is to improve the recognition performance over conventional unimodal schemes. The proposed system exploits not only the temporal and spatial correlations existing in the speech and video signals of a speaker, but also the cross-correlation between these two modalities. Lip images extracted from each video frame are transformed onto an eigenspace. The obtained eigenlip coefficients are interpolated to match the rate of the speech signal and fused with Mel frequency cepstral coefficients (MFCC) of the corresponding speech signal. The resulting joint feature vectors are used to train and test a hidden Markov model (HMM) based identification system. Experimental results are included to demonstrate the system performance.
  • Keywords
    biometrics (access control); covariance matrices; eigenvalues and eigenfunctions; face recognition; hidden Markov models; interpolation; learning (artificial intelligence); speaker recognition; speech processing; video signal processing; HMM; Mel frequency cepstral coefficients; bimodal speaker identification; biometric speaker identification; covariance matrix; eigenlip coefficients; eigenspace; hidden Markov model; joint audio-video processing; lip images; speech signals; video signals; Biometrics; Educational institutions; Graphics; Hidden Markov models; Laboratories; Multimedia systems; Robustness; Signal processing; Speech; Streaming media;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Acoustics, Speech, and Signal Processing, 2003. Proceedings. (ICASSP '03). 2003 IEEE International Conference on
  • ISSN
    1520-6149
  • Print_ISBN
    0-7803-7663-3
  • Type
    conf
  • DOI
    10.1109/ICASSP.2003.1202376
  • Filename
    1202376
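
The abstract describes interpolating eigenlip coefficients up to the rate of the speech features and fusing them with MFCCs into joint feature vectors. The following is a minimal sketch of that fusion step only, not the authors' implementation. It assumes linear interpolation, fusion by simple concatenation, and illustrative frame rates (100 audio feature frames per second, 25 video frames per second); none of these details are given in the record. The function names `interpolate_to_audio_rate` and `fuse_features` are hypothetical.

```python
import numpy as np


def interpolate_to_audio_rate(eigenlip, video_fps, n_audio_frames, audio_fps):
    """Resample per-video-frame lip coefficients onto the audio frame times.

    Assumption: linear interpolation. The record does not say which
    interpolation scheme the paper uses.
    """
    eigenlip = np.asarray(eigenlip, dtype=float)
    video_t = np.arange(eigenlip.shape[0]) / video_fps   # time of each video frame
    audio_t = np.arange(n_audio_frames) / audio_fps      # time of each audio frame
    # np.interp works on 1-D data, so each coefficient track is handled
    # separately. Beyond the last video frame it holds the final value.
    tracks = [np.interp(audio_t, video_t, eigenlip[:, k])
              for k in range(eigenlip.shape[1])]
    return np.column_stack(tracks)


def fuse_features(mfcc, eigenlip, audio_fps=100.0, video_fps=25.0):
    """Build joint audio-visual vectors by concatenating MFCCs with
    rate-matched eigenlip coefficients (assumed fusion: concatenation)."""
    mfcc = np.asarray(mfcc, dtype=float)
    lip = interpolate_to_audio_rate(eigenlip, video_fps, mfcc.shape[0], audio_fps)
    return np.hstack([mfcc, lip])


# Toy data: 8 audio frames of 13 MFCCs, 3 video frames of 2 lip coefficients.
mfcc = np.zeros((8, 13))
eigenlip = np.array([[0.0, 0.0], [4.0, 8.0], [8.0, 16.0]])
joint = fuse_features(mfcc, eigenlip)
print(joint.shape)
```

Each row of `joint` is one observation vector of the kind that would then be used to train and test the HMM-based identifier; the HMM stage itself is not sketched here.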