• DocumentCode
    2869457
  • Title

    Intensity/location normalization for automatic lipreading

  • Author

    Tanaka, Akiji ; Vanegas, Oscar ; Tokuda, Keiichi ; Kitamura, Tadashi

  • Author_Institution
    Dept. of Comput. Sci., Nagoya Inst. of Technol., Japan
  • Volume
    2
  • fYear
    1998
  • fDate
    1998
  • Firstpage
    920
  • Abstract
    This paper describes intensity and location normalization methods that improve the performance of a bimodal speech recognition system using visual information. In conventional speech recognition, many methods have been proposed for normalizing channel characteristics and speaker individuality. In this study, two methods similar to CMN and SAT are proposed for intensity and location normalization, respectively, and two kinds of feature vectors, a subsampled image and a 2D-DCT, are compared. Experimental results show that the normalization techniques substantially improve recognition rates.
  • Keywords
    discrete cosine transforms; feature extraction; image recognition; image sampling; image sequences; speech recognition; 2D-DCT; automatic lipreading; bimodal speech recognition; channel characteristic normalization; feature vectors; intensity/location normalization; recognition rate; speaker individuality; speech recognition system; subsampled image; visual information; Cepstral analysis; Character recognition; Computer science; Discrete cosine transforms; Error analysis; Image recognition; Image sequences; Lips; Robustness; Speech recognition
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Signal Processing Proceedings, 1998. ICSP '98. 1998 Fourth International Conference on
  • Conference_Location
    Beijing
  • Print_ISBN
    0-7803-4325-5
  • Type

    conf

  • DOI
    10.1109/ICOSP.1998.770762
  • Filename
    770762