Title :
Audio-visual automatic speech recognition and related bimodal speech technologies: A review of the state-of-the-art and open problems
Author :
Potamianos, Gerasimos
Author_Institution :
Inst. of Inf. & Telecommun., Nat. Centre for Sci. Res. Demokritos, Athens, Greece
Date :
Nov. 13 2009-Dec. 17 2009
Abstract :
Summary form only given. The presentation will provide an overview of the main research achievements and the state of the art in audio-visual speech processing, focusing mainly on audio-visual automatic speech recognition. The topic has been of interest to the speech research community because the visual modality holds the potential for increased robustness to acoustic noise. Nevertheless, significant challenges remain that have hindered practical applications of the technology, most notably difficulties with visual speech information extraction and with audio-visual fusion algorithms that remain robust to the environment variability inherent in practical, unconstrained interaction scenarios and audio-visual data sources, for example multiparty interaction in smart spaces and broadcast news. These challenges are also shared by a number of audio-visual speech technologies beyond the core speech recognition problem, where the visual modality has the potential to resolve ambiguity inherent in the audio signal alone, for example speech activity detection, speaker diarization, and source separation.
Keywords :
audio-visual systems; speech recognition; acoustic noise; audio-visual automatic speech recognition; audio-visual data sources; audio-visual fusion algorithm; bimodal speech technology; visual modality; visual speech information extraction; Automatic speech recognition; Broadcast technology; Broadcasting; Data mining; Noise robustness; Space technology; Speech enhancement; Speech processing
Conference_Title :
Automatic Speech Recognition & Understanding, 2009. ASRU 2009. IEEE Workshop on
Conference_Location :
Merano
Print_ISBN :
978-1-4244-5478-5
Electronic_ISBN :
978-1-4244-5479-2
DOI :
10.1109/ASRU.2009.5373530