Regional Information Center for Science and Technology - Continuous visual speech recognition for audio speech enhancement

DocumentCode :

3420647

Title :

Continuous visual speech recognition for audio speech enhancement

Author :

Benhaim, Eric ; Sahbi, Hichem ; Vittey, Guillaume

Author_Institution :

LTCI, Telecom ParisTech, Paris, France

fYear :

2015

fDate :

19-24 April 2015

Firstpage :

2244

Lastpage :

2248

Abstract :

We introduce in this paper a novel non-blind speech enhancement procedure based on visual speech recognition (VSR). The latter is based on a generative process that analyzes sequences of talking faces and classifies them into visual speech units known as visemes. We use an effective graphical model able to segment and label a given sequence of talking faces into a sequence of visemes. Our model captures unary potential as well as pairwise interaction; the former models the visual appearance of speech units while the latter models their interactions using boundary and visual language model activations. Experiments conducted on a standard, challenging dataset show that the speech enhancement procedure, when fed the results of VSR, clearly outperforms baseline blind methods as well as related work.
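
The labeling step described in the abstract can be illustrated with a minimal sketch, assuming a simple chain-structured model in which each frame has a unary score per viseme label and adjacent frames share a pairwise score. This is not the authors' model: the abstract does not specify the graph structure or the inference procedure, so Viterbi (max-product) decoding is used here as a stand-in, and the function name `viterbi_decode`, the array shapes, and the toy scores are all hypothetical.

```python
import numpy as np


def viterbi_decode(unary, pairwise):
    """Return the highest-scoring label sequence for a chain model.

    Hypothetical inputs, all in log space:
      unary:    (T, K) array, unary[t, k] = appearance score of label k at frame t
      pairwise: (K, K) array, pairwise[i, j] = score of label i followed by label j
    """
    T, K = unary.shape
    score = unary[0].copy()                  # best score of any path ending in each label
    backptr = np.zeros((T, K), dtype=int)    # best previous label for each (frame, label)
    for t in range(1, T):
        # cand[i, j]: best path ending in label i at t-1, then moving to label j
        cand = score[:, None] + pairwise
        backptr[t] = cand.argmax(axis=0)
        score = cand.max(axis=0) + unary[t]
    # Trace the best path backwards from the best final label.
    labels = [int(score.argmax())]
    for t in range(T - 1, 0, -1):
        labels.append(int(backptr[t, labels[-1]]))
    return labels[::-1]


# Toy example: two labels, three frames, with an ambiguous middle frame.
toy_unary = np.log(np.array([[0.9, 0.1], [0.45, 0.55], [0.9, 0.1]]))
toy_pairwise = np.log(np.array([[0.9, 0.1], [0.1, 0.9]]))  # favors keeping the same label
toy_labels = viterbi_decode(toy_unary, toy_pairwise)
```

In this toy setting the pairwise term smooths over the ambiguous middle frame, so runs of identical labels form segments; with a zero pairwise term the decoder reduces to a per-frame argmax over the unary scores.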

Keywords :

speech enhancement; speech recognition; visual languages; VSR; boundary model activations; generative process; non-blind speech enhancement procedure; talking faces; visemes; visual appearance; visual language model activations; visual speech recognition; visual speech units; Graphical models; Hidden Markov models; Noise; Speech; Speech enhancement; Speech recognition; Visualization; Visual speech recognition; belief propagation; model-based speech enhancement; probabilistic graphical model;

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Acoustics, Speech and Signal Processing (ICASSP), 2015 IEEE International Conference on

Conference_Location :

South Brisbane, QLD

Type :

conf

DOI :

10.1109/ICASSP.2015.7178370

Filename :

7178370

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=3420647