• DocumentCode
    1749628
  • Title

    Optimal weighting of posteriors for audio-visual speech recognition

  • Author

    Heckmann, Martin ; Berthommier, Frédéric ; Kroschel, Kristian

  • Author_Institution
    Inst. de la Communication Parlée, Inst. Nat. Polytech. de Grenoble, France
  • Volume
    1
  • fYear
    2001
  • fDate
    2001
  • Firstpage
    161
  • Abstract
    We investigate the fusion of audio and video a posteriori phonetic probabilities in a hybrid ANN/HMM audio-visual speech recognition system. Three basic conditions on the fusion process are stated and implemented in a linear and a geometric weighting scheme. These conditions are the assumption of conditional independence of the audio and video data, and the contribution of only one of the two paths when the SNR is very high or very low, respectively. For the geometric weighting, a new weighting scheme is developed, whereas the linear weighting follows the full combination approach as employed in multi-stream recognition. We compare these two new concepts in audio-visual recognition to a rather standard approach known from the literature. Recognition tests were performed on a continuous number recognition task using a single-speaker database containing 1712 utterances with two different types of noise added.
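    As a rough illustration of what linear and geometric weighting of posteriors mean in general, the sketch below fuses two per-class posterior vectors with a single weight. It is a generic textbook form, not the scheme developed in the paper: the function names, the single weight `lam` (assumed to lie in [0, 1] and to grow with SNR), and the `eps` guard are all assumptions made for illustration. It only reflects the boundary conditions named in the abstract, where one path alone contributes at very high or very low SNR.

```python
import math


def linear_fusion(p_audio, p_video, lam):
    """Weighted arithmetic mean of two posterior vectors (hypothetical helper).

    lam = 1 keeps only the audio path, lam = 0 only the video path.
    """
    fused = [lam * a + (1.0 - lam) * v for a, v in zip(p_audio, p_video)]
    total = sum(fused)
    return [f / total for f in fused]


def geometric_fusion(p_audio, p_video, lam, eps=1e-12):
    """Weighted geometric mean of two posterior vectors, renormalized.

    Computed in the log domain; eps guards against log(0) (an assumption,
    not something stated in the source).
    """
    log_fused = [
        lam * math.log(a + eps) + (1.0 - lam) * math.log(v + eps)
        for a, v in zip(p_audio, p_video)
    ]
    peak = max(log_fused)  # subtract the maximum for numerical stability
    unnorm = [math.exp(x - peak) for x in log_fused]
    total = sum(unnorm)
    return [u / total for u in unnorm]


if __name__ == "__main__":
    # Made-up posteriors over three phone classes, for demonstration only.
    audio = [0.7, 0.2, 0.1]
    video = [0.1, 0.6, 0.3]
    for lam in (0.0, 0.5, 1.0):
        print(lam, linear_fusion(audio, video, lam), geometric_fusion(audio, video, lam))
```

    In a sketch like this, the weight would typically be derived from an SNR estimate; how that mapping is chosen is exactly the kind of question the paper addresses and is left open here.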
  • Keywords
    Gaussian noise; audio signal processing; hidden Markov models; neural nets; probability; sensor fusion; speech recognition; video signal processing; white noise; a posteriori phonetic probabilities; audio-visual speech recognition; continuous number recognition task; full combination approach; geometric weighting scheme; hybrid ANN/HMM system; linear weighting scheme; optimal weighting; posteriors; single speaker database; Acoustic noise; Audio databases; Feature extraction; Hidden Markov models; Lips; Performance evaluation; Spatial databases; Speech recognition; Streaming media; Testing;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Acoustics, Speech, and Signal Processing, 2001. Proceedings. (ICASSP '01). 2001 IEEE International Conference on
  • Conference_Location
    Salt Lake City, UT
  • ISSN
    1520-6149
  • Print_ISBN
    0-7803-7041-4
  • Type

    conf

  • DOI
    10.1109/ICASSP.2001.940792
  • Filename
    940792