DocumentCode
3424178
Title
Gaze-contingent ASR for spontaneous, conversational speech: An evaluation
Author
Cooke, Neil ; Russell, Martin
Author_Institution
Multi-modal Interaction Lab., Birmingham Univ., Birmingham
fYear
2008
fDate
March 31 2008-April 4 2008
Firstpage
4433
Lastpage
4436
Abstract
There has been little work that attempts to improve the recognition of spontaneous, conversational speech by adding information from a loosely-coupled modality. This study investigated this idea by integrating information from gaze into an ASR system. A probabilistic framework for multimodal recognition was formalised and applied to the specific case of integrating gaze and speech. Gaze-contingent ASR systems were developed from a baseline ASR system by redistributing language model probability mass according to visual attention. The best-performing systems had Word Error Rates similar to those of the baseline ASR system and showed an increase in keyword spotting accuracy. The key finding was that the observed performance improvements were due to increased recognition accuracy for words associated with the visual field but not with the current focus of visual attention.
Keywords
speech recognition; word processing; automatic speech recognition; gaze-contingent ASR; keyword spotting accuracy; language model probability mass; loosely-coupled modality; spontaneous conversational speech; visual attention; word error rates; Automatic speech recognition; Error analysis; Human computer interaction; Laboratories; Maximum likelihood decoding; Speech analysis; Speech recognition; User interfaces; Visual system; Vocabulary; Bayes procedures
fLanguage
English
Publisher
IEEE
Conference_Titel
Acoustics, Speech and Signal Processing, 2008. ICASSP 2008. IEEE International Conference on
Conference_Location
Las Vegas, NV
ISSN
1520-6149
Print_ISBN
978-1-4244-1483-3
Electronic_ISBN
1520-6149
Type
conf
DOI
10.1109/ICASSP.2008.4518639
Filename
4518639