Title :
Video and audio based detection of filled hesitation pauses in classroom lectures
Author :
Tsiaras, Vassilis ; Panagiotakis, Costas ; Stylianou, Yannis
Author_Institution :
Dept. of Comput. Sci., Univ. of Crete, Heraklion, Greece
Abstract :
In this paper we study the detection of hesitation filled pauses in oral presentations of university lectures taught in the Greek language and recorded on a tablet PC using specialized software. We propose a hierarchical approach that fuses video data with audio data to increase the precision rate of our detection system. The detection method works at the frame level rather than the usual segmental level, which allows more accurate synchronization of audio and video data after the detected hesitations are removed. Audio characteristics are modeled using Gaussian Mixture Models, while the stationarity of the recorded video is also taken into account. This efficient combination of video and audio yields higher precision and recall rates compared with other works in the literature. On a dataset of approximately 7 hours, the precision rate is 99.6% and the recall rate is 84.7% when both audio and video data are taken into account.
Keywords :
Gaussian processes; audio signal processing; educational computing; educational institutions; mixture models; sensor fusion; speech processing; video signal processing; Gaussian mixture models; Greek language; audio based detection; audio characteristics modeling; audio data synchronization; classroom lectures; hesitation filled pause detection; hierarchical approach; oral presentations; specialized software; tablet PC; university lectures; video based detection; video data fusion; video data synchronization; Abstracts; Hip; Markov processes
Conference_Title :
Signal Processing Conference, 2009 17th European
Conference_Location :
Glasgow
Print_ISBN :
978-161-7388-76-7