• DocumentCode
    394747
  • Title
    An audio-visual approach to simultaneous-speaker speech recognition
  • Author
    Patterson, E.K. ; Gowdy, J.N.
  • Author_Institution
    Dept. of Comput. Sci., North Carolina Univ., Wilmington, NC, USA
  • Volume
    5
  • fYear
    2003
  • fDate
    6-10 April 2003
  • Abstract
    Audio-visual speech recognition is an area with great potential to help solve challenging problems in speech processing. The additional information provided by visual features significantly reduces difficulties due to background noise. Additional speech from other talkers during recording may be viewed as one of the most difficult sources of noise. The paper presents a study using audio-visual speech recognition for simultaneous-speaker speech recognition. The goal is to separate and potentially recognize speech from several simultaneous speakers. Speaker pairs from the CUAVE multimodal speech corpus (see http://ece.clemson.edu/speech) are used. Audio-visual techniques are compared against speaker-independent and speaker-dependent audio-only methods for recognizing the speech of individuals from these pairs.
  • Keywords
    acoustic noise; audio-visual systems; source separation; speech processing; speech recognition; video signal processing; audio-visual speech recognition; background noise; simultaneous-speaker speech recognition; Audio recording; Background noise; Computer science; Loudspeakers; Microphones; Speech enhancement; Speech processing; Speech recognition; Testing; Web pages
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Acoustics, Speech, and Signal Processing, 2003. Proceedings. (ICASSP '03). 2003 IEEE International Conference on
  • ISSN
    1520-6149
  • Print_ISBN
    0-7803-7663-3
  • Type
    conf
  • DOI
    10.1109/ICASSP.2003.1200087
  • Filename
    1200087