Title :
Speech vs music discrimination using Empirical Mode Decomposition
Author :
Khonglah, Banriskhem K. ; Sharma, Rajib ; Mahadeva Prasanna, S.R.
Author_Institution :
Dept. of Electron. & Electr. Eng., Indian Inst. of Technol. Guwahati, Guwahati, India
Date :
Feb. 27 - March 1, 2015
Abstract :
This work explores the use of Empirical Mode Decomposition (EMD) for discriminating speech regions from music regions in audio recordings. The Intrinsic Mode Functions (IMFs) obtained from EMD of the audio signal, which represent its different frequency scales, are found to contain discriminatory evidence for distinguishing the speech regions from the music regions. Statistical measures such as mean, absolute mean, variance, skewness and kurtosis are computed from the various IMFs and investigated for speech vs music discrimination. When these features are used with Support Vector Machine (SVM) and k-Nearest Neighbour (k-NN) classifiers on the Scheirer and Slaney database, the best overall classification accuracy is 90.83% for the SVM and 85.33% for the k-NN.
Keywords :
audio signal processing; music; speech processing; support vector machines; EMD analysis tool; SVM algorithm; Slaney database; audio recordings; empirical mode decomposition; intrinsic mode functions; k-nearest neighbour; speech regions discrimination; speech vs music discrimination; statistical measures; Databases; Feature extraction; Multiple signal classification; Speech; EMD; IMF
Conference_Title :
Communications (NCC), 2015 Twenty First National Conference on
Conference_Location :
Mumbai
DOI :
10.1109/NCC.2015.7084865
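Illustrative sketch :
The feature set named in the abstract (mean, absolute mean, variance, skewness and kurtosis per IMF) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `imf_statistics` and `feature_vector` are hypothetical, the IMFs are assumed to have been computed already by some EMD routine, the moment estimators are assumed to be the biased (population) ones since the record does not say which were used, and the SVM and k-NN stages are omitted.

```python
import math


def imf_statistics(imf):
    """Return [mean, absolute mean, variance, skewness, kurtosis] of one IMF.

    Assumption: biased (population) central moments; kurtosis is the
    non-excess form (a Gaussian would give a value near 3).
    """
    n = len(imf)
    mean = sum(imf) / n
    abs_mean = sum(abs(x) for x in imf) / n
    m2 = sum((x - mean) ** 2 for x in imf) / n  # variance
    m3 = sum((x - mean) ** 3 for x in imf) / n
    m4 = sum((x - mean) ** 4 for x in imf) / n
    if m2 == 0.0:
        # Constant signal: standardized moments are undefined, report zeros.
        skewness, kurtosis = 0.0, 0.0
    else:
        skewness = m3 / m2 ** 1.5
        kurtosis = m4 / m2 ** 2
    return [mean, abs_mean, m2, skewness, kurtosis]


def feature_vector(imfs):
    """Concatenate the five statistics of every IMF into one flat list."""
    features = []
    for imf in imfs:
        features.extend(imf_statistics(imf))
    return features


if __name__ == "__main__":
    # Toy stand-ins for IMFs: two sinusoids at different frequency scales.
    fs = 100
    t = [i / fs for i in range(fs)]
    toy_imfs = [
        [math.sin(2 * math.pi * 10 * x) for x in t],
        [0.5 * math.sin(2 * math.pi * 2 * x) for x in t],
    ]
    print(feature_vector(toy_imfs))
```

Each audio segment would yield one such vector (five values per IMF), which could then be passed to any standard classifier.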