DocumentCode
1846739
Title
Enhancing timbre model using MFCC and its time derivatives for music similarity estimation
Author
De Leon, Franz ; Martinez, Kirk
Author_Institution
Electron. & Comput. Sci., Univ. of Southampton, Southampton, UK
fYear
2012
fDate
27-31 Aug. 2012
Firstpage
2005
Lastpage
2009
Abstract
One popular method for content-based music similarity estimation models timbre by fitting a single multivariate Gaussian with a full covariance matrix to the MFCCs, then compares tracks using the symmetric Kullback-Leibler divergence. Borrowing from speech recognition, we propose applying the same approach to the MFCCs' time derivatives to enhance the timbre model. The Gaussian models for the delta and acceleration coefficients are used to create their respective distance matrices. The distance matrices are then combined linearly to form a full distance matrix for music similarity estimation. In our experiments on two datasets, our novel approach performs better than using MFCC alone. Moreover, genre classification using k-NN showed that the accuracies obtained are already close to the state of the art.
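The pipeline described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function names, the use of `np.gradient` as a stand-in for delta and acceleration coefficients, the combination weights, and the absence of any per-matrix normalization are all assumptions made for the example.

```python
import numpy as np

def fit_gaussian(frames):
    """Fit a single Gaussian with full covariance.

    frames: array of shape (n_frames, n_coeffs).
    """
    return frames.mean(axis=0), np.cov(frames, rowvar=False)

def symmetric_kl(g1, g2):
    """Symmetric KL divergence, KL(p||q) + KL(q||p), between two Gaussians.

    The log-determinant terms cancel in the symmetric form.
    """
    m1, s1 = g1
    m2, s2 = g2
    d = m1.shape[0]
    s1_inv = np.linalg.inv(s1)
    s2_inv = np.linalg.inv(s2)
    diff = m1 - m2
    trace_term = np.trace(s2_inv @ s1) + np.trace(s1_inv @ s2)
    mean_term = diff @ (s1_inv + s2_inv) @ diff
    return 0.5 * (trace_term + mean_term) - d

def distance_matrix(models):
    """Pairwise symmetric KL distances for a list of (mean, cov) models."""
    n = len(models)
    dist = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            dist[i, j] = dist[j, i] = symmetric_kl(models[i], models[j])
    return dist

def combined_distance(mfcc_per_track, weights=(0.7, 0.2, 0.1)):
    """Linearly combine MFCC, delta, and acceleration distance matrices.

    weights are hypothetical; the abstract does not state the values used.
    np.gradient is an assumed, simple approximation of the time derivatives.
    """
    streams = ([], [], [])
    for mfcc in mfcc_per_track:
        delta = np.gradient(mfcc, axis=0)   # first time derivative
        accel = np.gradient(delta, axis=0)  # second time derivative
        for stream, feats in zip(streams, (mfcc, delta, accel)):
            stream.append(fit_gaussian(feats))
    matrices = [distance_matrix(s) for s in streams]
    return sum(w * m for w, m in zip(weights, matrices))
```

Given a list of per-track MFCC arrays of shape `(n_frames, n_coeffs)`, `combined_distance` returns a symmetric matrix with a zero diagonal, which could then feed a k-NN classifier. In practice the three matrices may need to be scaled to comparable ranges before they are combined.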
Keywords
Gaussian processes; matrix algebra; music; signal classification; speech recognition; Gaussian models; MFCC; acceleration coefficients; content-based music similarity estimation; delta coefficients; enhancing timbre model; full covariance matrix; full distance matrix; genre classification; k-NN; single multivariate Gaussian process; symmetric Kullback-Leibler divergence; time derivatives; Acceleration; Computational modeling; Covariance matrix; Estimation; Mel frequency cepstral coefficient; Timbre; music similarity estimation
fLanguage
English
Publisher
IEEE
Conference_Titel
Signal Processing Conference (EUSIPCO), 2012 Proceedings of the 20th European
Conference_Location
Bucharest
ISSN
2219-5491
Print_ISBN
978-1-4673-1068-0
Type
conf
Filename
6333837