Title :
Asynchronous integration of audio and visual sources in bi-modal automatic speech recognition
Author :
Deleglise, Paul ; Rogozan, Alexandrina ; Alissali, Mamoun
Author_Institution :
LIUM, University of Maine, Av. Olivier Messiaen, BP 535, 72017 Le Mans Cedex, France
Abstract :
This paper presents our work on the integration of visual data in automatic speech recognition systems. We particularly aim at solving two problems: (1) the classification differences between the modeling of acoustic information (phonemes) and visual information (visemes); (2) the anticipation and retention of visemes relative to the corresponding phonemes. We developed and tested three systems, each dealing with one or both problems and proposing a different integration strategy. The comparison of system performances shows that some of the solutions we propose give satisfactory results, and suggests that further work on some of the others would lead to further performance improvements.
Keywords :
Acoustics; Hidden Markov models; Noise; Shape; Speech; Speech recognition; Visualization;
Conference_Title :
8th European Signal Processing Conference (EUSIPCO 1996)
Conference_Location :
Trieste, Italy
Print_ISBN :
978-888-6179-83-6