• DocumentCode
    2018595
  • Title

    Improving connected letter recognition by lipreading

  • Author

    Bregler, Christoph ; Hild, Hermann ; Manke, Stefan ; Waibel, Alex

  • Author_Institution
    Dept. of Comput. Sci., Karlsruhe Univ., Germany
  • Volume
    1
  • fYear
    1993
  • fDate
    27-30 April 1993
  • Firstpage
    557
  • Abstract
    The authors show how recognition performance in automated speech perception can be significantly improved by additional lipreading, so-called speech-reading. They demonstrate this on an extension of a state-of-the-art speech recognition system, a modular multistage time delay neural network architecture (MS-TDNN). The acoustic and visual speech data are preclassified in two separate front-end phoneme TDNNs and combined into acoustic-visual hypotheses for the dynamic time warping algorithm. The approach is evaluated on a connected word recognition problem, the notoriously difficult letter spelling task. With speech-reading, the error rate was reduced by up to half relative to purely acoustic recognition.
  • Keywords
    audio acoustics; audio-visual systems; errors; neural nets; speech recognition; acoustic-visual hypotheses; automated speech perception; connected letter recognition; dynamic time warping algorithm; error rate; lipreading; modular multistage time delay neural network; speech recognition system; speech-reading
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Acoustics, Speech, and Signal Processing, 1993. ICASSP-93., 1993 IEEE International Conference on
  • Conference_Location
    Minneapolis, MN, USA
  • ISSN
    1520-6149
  • Print_ISBN
    0-7803-7402-9
  • Type

    conf

  • DOI
    10.1109/ICASSP.1993.319179
  • Filename
    319179