DocumentCode :
701486
Title :
Asynchronous integration of audio and visual sources in bi-modal automatic speech recognition
Author :
Deleglise, Paul ; Rogozan, Alexandrina ; Alissali, Mamoun
Author_Institution :
LIUM, University of Maine, Av. Olivier Messiaen, BP 535, 72017 Le Mans Cedex, France
fYear :
1996
fDate :
10-13 Sept. 1996
Firstpage :
1
Lastpage :
4
Abstract :
This paper presents our work on the integration of visual data in automatic speech recognition systems. We particularly aim at solving two problems: (1) classification differences between the modeling of acoustic information (phonemes) and visual information (visemes); (2) the phenomena of anticipation and retention of visemes relative to the corresponding phonemes. We developed and tested three systems, each dealing with one or both problems and proposing a different integration strategy. The comparison of system performances shows that some of the solutions we propose give satisfactory results, and suggests that further work on some of the others would lead to further performance improvement.
Keywords :
Acoustics; Hidden Markov models; Noise; Shape; Speech; Speech recognition; Visualization;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
European Signal Processing Conference, 1996. EUSIPCO 1996. 8th
Conference_Location :
Trieste, Italy
Print_ISBN :
978-888-6179-83-6
Type :
conf
Filename :
7083212