DocumentCode
2039634
Title
Novel approach for detecting applause in continuous meeting speech
Author
Manoj, C. ; Magesh, S. ; Sankaran, Aditya Sriram ; Manikandan, M. Sabarimalai
Author_Institution
Dept. of Electron. & Commun. Eng., Amrita Vishwa Vidyapeetham, Coimbatore, India
Volume
3
fYear
2011
fDate
8-10 April 2011
Firstpage
182
Lastpage
186
Abstract
This paper proposes a robust, automated applause detection algorithm for meeting speech. The algorithm uses short-time autocorrelation features such as the autocorrelation energy decay factor and the amplitude and lag values of the first local minimum and zero-crossing points, all extracted from the autocorrelation sequence of a windowed audio signal. Decision thresholds applied to these acoustic features separate applause from non-applause segments in the audio stream. The proposed algorithm is compared with a conventional method that uses mel-frequency cepstral coefficient (MFCC) feature vectors with a Gaussian mixture model (GMM) classifier. Performance is also analyzed while varying the number of GMM mixtures (2, 4, 8, 16, and 32) and the thresholds of the proposed method. Both methods are tested on a multimedia database containing 4 hours 37 minutes of meeting speech. The proposed method achieves a precision of 94.40%, a recall of 90.75%, and an F1 score of 92.54%, against 67.47%, 96.13%, and 79.29%, respectively, for the conventional method.
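The features named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the rectangular (unweighted) frame, and in particular the definition of `energy_decay_factor` and its `split` parameter are assumptions made here for illustration, since the abstract does not define them. The decision thresholds are not given in the abstract either, so no classification rule is attempted. The `f1_score` helper uses the standard harmonic-mean definition and can be used to cross-check the reported scores against the reported precision and recall.

```python
import math


def autocorrelation(frame):
    """Autocorrelation of one frame, normalized so that r[0] == 1."""
    n = len(frame)
    r = [sum(frame[i] * frame[i + k] for i in range(n - k)) for k in range(n)]
    if r[0] == 0:
        return [0.0] * n  # silent frame: nothing to normalize
    return [v / r[0] for v in r]


def first_zero_crossing(r):
    """Smallest lag >= 1 at which the sequence is no longer positive, else None."""
    for k in range(1, len(r)):
        if r[k] <= 0:
            return k
    return None


def first_local_minimum(r):
    """(lag, amplitude) of the first local minimum, else (None, None)."""
    for k in range(1, len(r) - 1):
        if r[k] < r[k - 1] and r[k] <= r[k + 1]:
            return k, r[k]
    return None, None


def energy_decay_factor(r, split=10):
    """ASSUMED definition (not from the paper): share of the autocorrelation
    energy that lies at lags >= split. A fast-decaying sequence gives a
    value near 0."""
    total = sum(v * v for v in r)
    return sum(v * v for v in r[split:]) / total if total else 0.0


def f1_score(precision, recall):
    """Harmonic mean of precision and recall (same units as the inputs)."""
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # A pure tone with a 20-sample period stands in for a strongly periodic frame.
    tone = [math.sin(2 * math.pi * i / 20) for i in range(200)]
    r = autocorrelation(tone)
    print("first zero crossing lag:", first_zero_crossing(r))
    print("first local minimum (lag, amplitude):", first_local_minimum(r))
    print("assumed energy decay factor:", energy_decay_factor(r))
    print("F1, proposed:", round(f1_score(94.40, 90.75), 2))
    print("F1, conventional:", round(f1_score(67.47, 96.13), 2))
```

A periodic frame such as the tone above keeps substantial autocorrelation energy at large lags, whereas a noise-like frame would be expected to decay quickly; how the paper turns that contrast into thresholds is not stated in the abstract.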
Keywords
Gaussian processes; audio signal processing; Gaussian mixture model; MFCC feature vectors; acoustic features; amplitude values; autocorrelation energy decay factor; autocorrelation sequence; automated applause detection algorithm; continuous meeting speech; decision thresholds; lag values; mel frequency cepstral coefficients; multimedia database; short-time autocorrelation feature; windowed audio signal; zero crossing points extraction; Correlation; Feature extraction; Finite impulse response filter; Hidden Markov models; Mel frequency cepstral coefficient; Noise; Speech; Audio classification; audio content analysis; semantic video analysis; sports highlight extraction; video summarization;
fLanguage
English
Publisher
IEEE
Conference_Titel
Electronics Computer Technology (ICECT), 2011 3rd International Conference on
Conference_Location
Kanyakumari
Print_ISBN
978-1-4244-8678-6
Electronic_ISBN
978-1-4244-8679-3
Type
conf
DOI
10.1109/ICECTECH.2011.5941827
Filename
5941827