Title :
Utilizing multimodal cues to automatically evaluate public speaking performance
Author :
Lei Chen;Chee Wee Leong;Gary Feng;Chong Min Lee;Swapna Somasundaran
Author_Institution :
Educational Testing Service (ETS), 660 Rosedale Rd, Princeton, New Jersey 08541
Abstract :
Public speaking, an important form of oral communication, is critical to success in both learning and career development. However, tools to efficiently and economically evaluate presenters' verbal and nonverbal behaviors are lacking. Recent advances in automated scoring and multimodal sensing technologies may address this issue. We report a study on the development of an automated scoring model for public speaking performance using multimodal cues. A multimodal presentation corpus containing 56 presentations from 14 subjects was recorded using a Microsoft Kinect depth camera. Task design, rubric development, and human rating were conducted according to standards in educational assessment. A rich set of multimodal features was extracted from head poses, eye gazes, facial expressions, motion traces, the speech signal, and transcripts. Model-building experiments show that jointly using lexical/speech and visual features achieves more accurate scoring, which suggests the feasibility of using multimodal technologies in the assessment of public speaking skills.
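The scoring approach the abstract describes, fusing lexical/speech and visual features to predict human rubric scores, can be illustrated with a minimal sketch. Everything below is an assumption for illustration rather than the paper's actual pipeline: the feature dimensions, the random-forest regressor, the score range, and the speaker-held-out cross-validation are all hypothetical stand-ins.

    # Hypothetical sketch: early fusion of lexical/speech and visual features
    # for automated public speaking scoring. Data here is synthetic; the paper's
    # real features come from Kinect-based tracking, speech, and transcripts.
    import numpy as np
    from scipy.stats import pearsonr
    from sklearn.ensemble import RandomForestRegressor
    from sklearn.model_selection import LeaveOneGroupOut, cross_val_predict

    rng = np.random.default_rng(0)

    # Placeholder corpus: 56 presentations by 14 speakers (4 each), with
    # speech/lexical features (e.g., speaking rate, pause statistics) and
    # visual features (e.g., head pose, gaze, motion-trace statistics).
    n_presentations, n_speakers = 56, 14
    speech_feats = rng.normal(size=(n_presentations, 20))
    visual_feats = rng.normal(size=(n_presentations, 30))
    human_scores = rng.uniform(1, 4, size=n_presentations)  # rubric ratings
    speaker_ids = np.repeat(np.arange(n_speakers), n_presentations // n_speakers)

    # Early fusion: concatenate modality-specific feature vectors.
    fused = np.hstack([speech_feats, visual_feats])

    # Speaker-independent evaluation: hold out all presentations of one
    # speaker per fold so the model never sees the test speaker in training.
    model = RandomForestRegressor(n_estimators=200, random_state=0)
    pred = cross_val_predict(model, fused, human_scores,
                             cv=LeaveOneGroupOut(), groups=speaker_ids)
    print("Pearson r vs. human scores:", pearsonr(pred, human_scores)[0])

Comparing this fused model against single-modality baselines (speech-only or visual-only feature matrices) is the natural way to test the abstract's claim that joint features score more accurately.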
Keywords :
"Public speaking","Speech","Feature extraction","Reliability","Tracking","Multimodal sensors","Cameras"
Conference_Title :
2015 International Conference on Affective Computing and Intelligent Interaction (ACII)
Electronic_ISSN :
2156-8111
DOI :
10.1109/ACII.2015.7344601