DocumentCode :
806723
Title :
A Study in Efficiency and Modality Usage in Multimodal Form Filling Systems
Author :
Perakakis, Manolis ; Potamianos, Alexandros
Author_Institution :
Dept. of Electron. & Comput. Eng., Tech. Univ. of Crete, Chania
Volume :
16
Issue :
6
fYear :
2008
Firstpage :
1194
Lastpage :
1206
Abstract :
The usage patterns of speech and visual input modes are investigated as a function of relative input mode efficiency for both desktop and personal digital assistant (PDA) working environments. For this purpose the form-filling part of a multimodal dialogue system is implemented and evaluated; three multimodal modes of interaction are implemented: "Click-to-Talk," "Open-Mike," and "Modality-Selection." "Modality-Selection" implements an adaptive interface where the system selects the most efficient input mode at each turn, effectively alternating between a "Click-to-Talk" and an "Open-Mike" interaction style, as proposed in "Modality tracking in the multimodal Bell Labs Communicator," in Proceedings of the Automatic Speech Recognition and Understanding Workshop, by A. Potamianos, 2003. The multimodal systems are evaluated and compared with the unimodal systems. Objective and subjective measures used include task completion, task duration, turn duration, and overall user satisfaction. Turn duration is broken down into interaction time and inactivity time to better measure the efficiency of each input mode. Duration statistics and empirical probability density functions are computed as a function of interaction context and user. Results show that the multimodal systems outperform the unimodal systems in terms of objective and subjective criteria. Also, users tend to use the most efficient input mode at each turn; however, biases towards the default input modality and a general bias towards the speech modality also exist. Results demonstrate that although users exploit some of the available synergies in multimodal dialogue interaction, further efficiency gains can be achieved by designing adaptive interfaces that fully exploit these synergies.
Keywords :
human factors; interactive systems; speech recognition; speech-based user interfaces; Automatic Speech Recognition and Understanding Workshop; adaptive interface; click-to-talk mode; duration statistics; empirical probability density function; modality-selection mode; multimodal Bell Labs communicator; multimodal dialogue system; multimodal form filling system; open-mike mode; personal digital assistant; speech modality; visual input mode; Automatic speech recognition; Filling; Graphical user interfaces; Oral communication; Personal digital assistants; Probability density function; Prototypes; Statistics; Time measurement; User interfaces; Graphical user interfaces (GUIs); input modality selection; mobile multimodal interfaces; speech communication;
fLanguage :
English
Journal_Title :
Audio, Speech, and Language Processing, IEEE Transactions on
Publisher :
IEEE
ISSN :
1558-7916
Type :
jour
DOI :
10.1109/TASL.2008.2001389
Filename :
4566088