• DocumentCode
    2346760
  • Title

    Visual Speech Recognition Using Image Moments and Multiresolution Wavelet Images

  • Author

    Yau, Wai C. ; Kumar, Dinesh K. ; Arjunan, Sridhar P. ; Kumar, Sanjay

  • Author_Institution
    Sch. of Electr. & Comput. Eng., R. Melbourne Inst. of Technol., Vic.
  • fYear
    2006
  • fDate
    26-28 July 2006
  • Firstpage
    194
  • Lastpage
    199
  • Abstract
    This paper describes a new technique for recognizing speech using visual speech information. The video data of the speaker's mouth is represented using grayscale images called motion history images (MHIs). An MHI is generated by applying accumulative image differencing to the frames of the video to implicitly represent the temporal information of the mouth movement. The MHIs are decomposed into wavelet sub-images using the discrete stationary wavelet transform (SWT). Three moment-based features (geometric moments, Zernike moments and Hu moments) are extracted from the SWT approximation sub-images. A multilayer perceptron (MLP) artificial neural network (ANN) with the back-propagation learning algorithm is used to classify the moment features. This paper evaluates and compares the image representation ability of the different moments. Initial experiments show that this method can classify English consonants with an error rate of less than 5%.
  • Keywords
    backpropagation; discrete wavelet transforms; feature extraction; image motion analysis; image representation; image sequences; multilayer perceptrons; speech recognition; video signal processing; artificial neural network; back propagation learning algorithm; discrete stationary wavelet transform; grayscale image; image moment; image representation; moment-based feature extraction; motion history image; mouth movement; multilayer perceptron; multiresolution wavelet transform; temporal information; video signal processing; visual speech recognition; Artificial neural networks; Data mining; Discrete wavelet transforms; Gray-scale; History; Image representation; Image resolution; Mouth; Multilayer perceptrons; Speech recognition; discrete stationary wavelet transform; image moments; motion history image; visual speech recognition;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Computer Graphics, Imaging and Visualisation, 2006 International Conference on
  • Conference_Location
    Sydney, Qld.
  • Print_ISBN
    0-7695-2606-3
  • Type

    conf

  • DOI
    10.1109/CGIV.2006.92
  • Filename
    1663790
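
The first two stages named in the abstract (building an MHI by accumulative image differencing, then computing geometric moments) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the record gives no formulas, so the binarised frame difference, the `threshold` value, the "most recent motion wins" update, and the scaling to 0-255 are all assumptions based on one common MHI formulation. The function names are hypothetical. The SWT decomposition, Zernike and Hu moments, and the MLP classifier are omitted.

```python
import numpy as np


def motion_history_image(frames, threshold=25):
    """Build a grayscale MHI from a stack of grayscale frames (T, H, W).

    Assumed formulation: a pixel counts as "moving" at step t when the
    absolute difference between frame t and frame t-1 reaches `threshold`.
    Each moving pixel is stamped with t, so later motion appears brighter.
    """
    frames = np.asarray(frames, dtype=np.float64)
    n = frames.shape[0]
    if n < 2:
        raise ValueError("need at least two frames")
    mhi = np.zeros(frames.shape[1:], dtype=np.float64)
    for t in range(1, n):
        moved = np.abs(frames[t] - frames[t - 1]) >= threshold
        mhi[moved] = t  # more recent motion overwrites older motion
    # Scale time stamps 0..n-1 to the 0..255 grayscale range.
    return np.round(255.0 * mhi / (n - 1)).astype(np.uint8)


def geometric_moment(image, p, q):
    """Raw geometric moment m_pq = sum over pixels of x^p * y^q * I(x, y)."""
    image = np.asarray(image, dtype=np.float64)
    y, x = np.mgrid[: image.shape[0], : image.shape[1]]
    return float(np.sum((x ** p) * (y ** q) * image))


# Synthetic example: one new pixel "moves" in each successive frame.
demo_frames = np.zeros((4, 4, 4))
demo_frames[1:, 0, 0] = 100
demo_frames[2:, 1, 1] = 100
demo_frames[3:, 2, 2] = 100
demo_mhi = motion_history_image(demo_frames)
demo_m00 = geometric_moment(demo_mhi, 0, 0)
demo_m10 = geometric_moment(demo_mhi, 1, 0)
```

In the pipeline the abstract describes, the moments would be computed on the SWT approximation sub-image of the MHI rather than on the MHI directly; the sketch skips that step to stay dependency-free beyond NumPy.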