Title :
Towards real-time music auto-tagging using sparse features
Author_Institution :
Res. Center for IT Innovation, Acad. Sinica, Taipei, Taiwan
Abstract :
Unsupervised feature learning algorithms such as sparse coding and deep belief networks have been shown to be a viable alternative to hand-crafted feature design for music information retrieval. Nevertheless, such algorithms are usually computationally expensive. This paper investigates techniques to accelerate sparse feature extraction and music classification. To study the trade-off between computational efficiency and accuracy, we compare state-of-the-art dense audio features with sparse features computed using 1) sparse coding with a random dictionary, 2) a randomized clustering forest, and 3) an extension of the randomized clustering forest to temporal signals. For classifier training and prediction, we compare support vector machines with linear and non-linear kernel functions. We evaluate music auto-tagging for 140 genre/style tags using a subset of 7,799 songs from the CAL10k data set. Our results show an 11-fold speed increase with a 3.45% accuracy loss compared to dense features. With the proposed sparse features, feature extraction and auto-tagging can be finished in 1 second per song, with a tagging accuracy of 0.1302 in mean average precision.
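The tagging accuracy above is reported as mean average precision (MAP). As a minimal sketch of how such a figure can be computed, the snippet below assumes average precision is taken per tag over songs ranked by classifier score and then averaged across tags; the abstract does not state the exact protocol, and the function names and toy data are hypothetical.

```python
def average_precision(scores, labels):
    """Average precision for one tag (assumed per-tag protocol).

    scores: classifier scores, one per song.
    labels: 1 if the song carries the tag, else 0.
    """
    # Rank songs from highest to lowest score.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    hits = 0
    precision_sum = 0.0
    for rank, (_, relevant) in enumerate(ranked, start=1):
        if relevant:
            hits += 1
            # Precision at the rank of each relevant song.
            precision_sum += hits / rank
    return precision_sum / hits if hits else 0.0


def mean_average_precision(score_rows, label_rows):
    """Mean of per-tag average precisions, skipping tags with no positives."""
    aps = [
        average_precision(scores, labels)
        for scores, labels in zip(score_rows, label_rows)
        if any(labels)
    ]
    return sum(aps) / len(aps) if aps else 0.0


# Toy data (hypothetical): 2 tags, 4 songs each.
toy_scores = [[0.9, 0.8, 0.7, 0.6], [0.1, 0.4, 0.35, 0.8]]
toy_labels = [[1, 0, 1, 0], [1, 0, 0, 0]]
toy_map = mean_average_precision(toy_scores, toy_labels)
```

With 140 tags, `score_rows` and `label_rows` would each hold 140 rows, one per tag, with one entry per test song.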
Keywords :
feature extraction; information retrieval; music; support vector machines; CAL10k data set; auto-tagging operations; classifier prediction; classifier training; computational efficiency; deep belief networks; dense audio features; hand-crafted feature design; mean average precision; music classification; music information retrieval; nonlinear kernel functions; random dictionary; randomized clustering forest; real-time music auto-tagging; sparse coding; sparse feature extraction; sparse features; temporal signals; unsupervised feature learning algorithms; Mel frequency cepstral coefficient; Unsupervised feature learning; music auto-tagging
Conference_Title :
Multimedia and Expo (ICME), 2013 IEEE International Conference on
Conference_Location :
San Jose, CA
DOI :
10.1109/ICME.2013.6607505