Modified lasso screening for audio word-based music classification using large-scale dictionary

Author

Ping-Keng Jao ; Chin-Chia Michael Yeh ; Yi-Hsuan Yang

Author_Institution

Res. Center for Inf. Technol. Innovation, Acad. Sinica, Taipei, Taiwan

fYear

2014

fDate

4-9 May 2014

Firstpage

5207

Lastpage

5211

Abstract

Representing music information using audio codewords has led to state-of-the-art performance on various music classification benchmarks. Compared with conventional audio descriptors, audio words offer greater flexibility in capturing the nuances of music signals, in that each codeword can be viewed as a quantization of the music universe and the quantization becomes finer as the size of the dictionary (i.e., audio codebook) increases. In practice, however, the high computational cost of codeword assignment might discourage the use of a large dictionary. This paper presents two modifications of a LASSO screening technique developed in the compressive sensing field to speed up the codeword assignment process. The first modification exploits the repetitive nature of music signals, whereas the second relaxes a screening constraint that is needed for reconstruction but not for classification. Our experiments show that the proposed method enables the use of a dictionary of 10,000 codewords with a runtime close to that of a dictionary of 1,000 codewords. Moreover, using the larger dictionary significantly improves the mean average precision (MAP) from 0.219 to 0.246 for tagging thousands of tracks with 147 possible genre tags.

Keywords

audio coding; compressed sensing; music; signal classification; LASSO screening technique; audio codebook; audio codeword; audio word-based music classification; compressive sensing field; large scale dictionary; modified lasso screening; music signal; music universe quantization; Accuracy; Dictionaries; Encoding; Multiple signal classification; Music; Support vector machines; Tagging; LASSO screening; Sparse coding; feature learning; genre classification; music information retrieval;

fLanguage

English

Publisher

IEEE

Conference_Titel

Acoustics, Speech and Signal Processing (ICASSP), 2014 IEEE International Conference on

Conference_Location

Florence

Type

conf

DOI

10.1109/ICASSP.2014.6854596

Filename

6854596