Title :
Enhancing timbre model using MFCC and its time derivatives for music similarity estimation
Author :
De Leon, Franz ; Martinez, Kirk
Author_Institution :
Electron. & Comput. Sci., Univ. of Southampton, Southampton, UK
Abstract :
One of the popular methods for content-based music similarity estimation is to model timbre with MFCCs as a single multivariate Gaussian with a full covariance matrix, then compare models using the symmetric Kullback-Leibler divergence. Borrowing from the field of speech recognition, we propose to apply the same approach to the MFCCs' time derivatives to enhance the timbre model. The Gaussian models for the delta and acceleration coefficients are used to create their respective distance matrices. The distance matrices are then combined linearly to form a full distance matrix for music similarity estimation. In our experiments on two datasets, our novel approach performs better than using MFCCs alone. Moreover, genre classification using k-NN showed that the accuracies obtained are already close to the state of the art.
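The pipeline described in the abstract can be sketched roughly as follows. This is a minimal illustration in NumPy, not the authors' implementation: the function names, the use of `np.gradient` as a stand-in for delta and acceleration coefficients, and the example weights are all assumptions, and the abstract does not say whether the distance matrices are normalized before being combined.

```python
import numpy as np

def fit_gaussian(frames):
    """Fit one full-covariance Gaussian to a (n_frames, n_dims) feature array."""
    return frames.mean(axis=0), np.cov(frames, rowvar=False)

def symmetric_kl(g1, g2):
    """Symmetric KL divergence KL(p||q) + KL(q||p) between two Gaussians."""
    (m1, c1), (m2, c2) = g1, g2
    ic1, ic2 = np.linalg.inv(c1), np.linalg.inv(c2)
    d = m1 - m2
    k = m1.shape[0]
    # The log-determinant terms cancel in the symmetric sum.
    return 0.5 * (np.trace(ic2 @ c1) + np.trace(ic1 @ c2) + d @ (ic1 + ic2) @ d) - k

def distance_matrix(models):
    """Pairwise symmetric KL distances between a list of Gaussian models."""
    n = len(models)
    dist = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            dist[i, j] = dist[j, i] = symmetric_kl(models[i], models[j])
    return dist

def combined_distance(tracks, weights=(0.5, 0.3, 0.2)):
    """Linearly combine static, delta and acceleration distance matrices.

    `tracks` is a list of (n_frames, n_mfcc) arrays. The weights are
    hypothetical placeholders; the abstract does not give their values.
    """
    static = tracks
    # Assumption: a simple finite difference along time stands in for the
    # regression-based delta coefficients common in speech recognition.
    delta = [np.gradient(t, axis=0) for t in static]
    accel = [np.gradient(t, axis=0) for t in delta]
    total = np.zeros((len(tracks), len(tracks)))
    for w, feats in zip(weights, (static, delta, accel)):
        total += w * distance_matrix([fit_gaussian(f) for f in feats])
    return total
```

Given MFCC arrays for a collection of tracks, `combined_distance(tracks)` returns a square matrix whose rows could then be ranked for nearest-neighbour retrieval or k-NN genre classification.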
Keywords :
Gaussian processes; matrix algebra; music; signal classification; speech recognition; Gaussian models; MFCC; acceleration coefficients; content-based music similarity estimation; delta coefficients; enhancing timbre model; full covariance matrix; full distance matrix; genre classification; k-NN; single multivariate Gaussian process; symmetric Kullback-Leibler divergence; time derivatives; Acceleration; Computational modeling; Covariance matrix; Estimation; Mel frequency cepstral coefficient; Timbre; music similarity estimation
Conference_Title :
Signal Processing Conference (EUSIPCO), 2012 Proceedings of the 20th European
Conference_Location :
Bucharest
Print_ISBN :
978-1-4673-1068-0