DocumentCode
698084
Title
Video and audio based detection of filled hesitation pauses in classroom lectures
Author
Tsiaras, Vassilis ; Panagiotakis, Costas ; Stylianou, Yannis
Author_Institution
Dept. of Comput. Sci., Univ. of Crete, Heraklion, Greece
fYear
2009
fDate
24-28 Aug. 2009
Firstpage
834
Lastpage
838
Abstract
In this paper we study the detection of filled hesitation pauses in oral presentations of university lectures taught in Greek and recorded on a tablet PC using specialized software. We propose a hierarchical approach that fuses video data with audio data to increase the precision of our detection system. The detection method works at the frame level rather than the usual segment level, which allows more accurate synchronization of audio and video data after the detected hesitations are removed. Audio characteristics are modeled using Gaussian Mixture Models, while the stationarity of the recorded video is also taken into account. This efficient combination of video and audio yields higher precision and recall rates compared with other works in the literature. On a dataset of approximately 7 hours, the precision rate is 99.6% and the recall rate is 84.7% when both audio and video data are taken into account.
Keywords
Gaussian processes; audio signal processing; educational computing; educational institutions; mixture models; sensor fusion; speech processing; video signal processing; Gaussian mixture models; Greek language; audio based detection; audio characteristics modeling; audio data synchronization; classroom lectures; hesitation filled pause detection; hierarchical approach; oral presentations; specialized software; tablet PC; university lectures; video based detection; video data fusion; video data synchronization; Abstracts; Hip; Markov processes
fLanguage
English
Publisher
IEEE
Conference_Titel
Signal Processing Conference, 2009 17th European
Conference_Location
Glasgow
Print_ISBN
978-161-7388-76-7
Type
conf
Filename
7077658