• DocumentCode
    332273
  • Title

    Audiovisual speech enhancement: new advances using multi-layer perceptrons

  • Author

    Girin, L. ; Varin, L. ; Feng, G. ; Schwartz, J.-L.

  • Author_Institution
    Inst. de la Commun. Parlee, Stendhal Univ., Grenoble, France
  • fYear
    1998
  • fDate
    7-9 Dec 1998
  • Firstpage
    77
  • Lastpage
    82
  • Abstract
    This paper deals with the improvement of a noisy speech enhancement system based on the fusion of auditory and visual information. The system was presented in previous papers and implemented with simple stimuli corrupted with white noise. Its principle consists of an analysis-enhancement-synthesis process based on a linear prediction (LP) model of the signal: the LP filter is enhanced by associative tools that estimate the cleaned LP parameters from both noisy audio and lip shape information. The structure of the system is reviewed, and we focus on the improvement that concerns the associators: multi-layer perceptrons are used instead of linear regression. It is shown that, in the context of vowel-consonant-vowel (VCV) transitions corrupted with white noise, the performance of the system is improved in terms of intelligibility gain, distance measures and classification tests.
  • Keywords
    audio signal processing; audio-visual systems; filtering theory; image processing; multilayer perceptrons; prediction theory; sensor fusion; speech enhancement; speech intelligibility; white noise; Gaussian classification tests; LP cleaned parameters; LP filter; VCV transitions; analysis-enhancement-synthesis process; audio-visual speech enhancement; auditory information; distance measures; informal listening tests; information fusion; intelligibility gain; linear prediction model; linear regression; lip shape information; multi-layer perceptrons; noisy audio; noisy speech enhancement system; system performance; visual information; Information analysis; Information filtering; Information filters; Multi-stage noise shaping; Nonlinear filters; Predictive models; Signal analysis; Signal processing; Speech enhancement; White noise
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Multimedia Signal Processing, 1998 IEEE Second Workshop on
  • Conference_Location
    Redondo Beach, CA
  • Print_ISBN
    0-7803-4919-9
  • Type

    conf

  • DOI
    10.1109/MMSP.1998.738916
  • Filename
    738916
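
The analysis-enhancement-synthesis principle summarized in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: all function names are hypothetical, the LP analysis uses the standard autocorrelation method with the Levinson-Durbin recursion, and `associator` is a placeholder for whatever trained mapping (linear regression or a multi-layer perceptron) estimates cleaned LP parameters from noisy LP parameters and lip shape features. Reusing the noisy residual as the excitation and passing raw predictor coefficients to the associator are also assumptions; the abstract does not specify either.

```python
def autocorrelation(frame, order):
    """Return the autocorrelation lags r[0..order] of one frame."""
    n = len(frame)
    return [sum(frame[i] * frame[i + k] for i in range(n - k))
            for k in range(order + 1)]


def levinson_durbin(r, order):
    """Solve the LP normal equations for a[1..order],
    where s[n] is predicted as sum_k a[k] * s[n - k]."""
    a = [0.0] * (order + 1)
    err = r[0]  # prediction error energy; assumes a non-silent frame
    for i in range(1, order + 1):
        k = (r[i] - sum(a[j] * r[i - j] for j in range(1, i))) / err
        prev = a[:]
        a[i] = k
        for j in range(1, i):
            a[j] = prev[j] - k * prev[i - j]
        err *= 1.0 - k * k
    return a[1:]


def inverse_filter(frame, a):
    """Analysis step: residual e[n] = s[n] - sum_k a[k] * s[n - k]."""
    return [s - sum(ak * frame[n - k]
                    for k, ak in enumerate(a, start=1) if n - k >= 0)
            for n, s in enumerate(frame)]


def lp_synthesis(residual, a):
    """Synthesis step: s[n] = e[n] + sum_k a[k] * s[n - k]."""
    out = []
    for n, e in enumerate(residual):
        out.append(e + sum(ak * out[n - k]
                           for k, ak in enumerate(a, start=1) if n - k >= 0))
    return out


def lp_analysis(frame, order):
    """Return (LP coefficients, residual) for one frame."""
    a = levinson_durbin(autocorrelation(frame, order), order)
    return a, inverse_filter(frame, a)


def enhance_frame(noisy_frame, lip_features, associator, order=10):
    """Analysis, enhancement of the LP filter, then synthesis.

    `associator(a_noisy, lip_features)` is a hypothetical callable that
    returns estimated clean LP coefficients (same length as a_noisy).
    """
    a_noisy, residual = lp_analysis(noisy_frame, order)
    a_clean_est = associator(a_noisy, lip_features)
    return lp_synthesis(residual, a_clean_est)
```

With an identity associator such as `lambda a_noisy, lip: a_noisy`, `enhance_frame` returns the input frame up to rounding error, which is a convenient sanity check before plugging in a trained mapping. Note that an associator predicting raw predictor coefficients may yield an unstable synthesis filter; a practical system would likely use a parameterization that makes stability easier to enforce.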