• DocumentCode
    179470
  • Title
    Modified lasso screening for audio word-based music classification using large-scale dictionary
  • Author
    Ping-Keng Jao ; Yeh, Chin-Chia Michael ; Yi-Hsuan Yang
  • Author_Institution
    Res. Center for Inf. Technol. Innovation, Acad. Sinica, Taipei, Taiwan
  • fYear
    2014
  • fDate
    4-9 May 2014
  • Firstpage
    5207
  • Lastpage
    5211
  • Abstract
    Representing music information using audio codewords has led to state-of-the-art performance on various music classification benchmarks. Compared to conventional audio descriptors, audio words offer greater flexibility in capturing the nuance of music signals, in that each codeword can be viewed as a quantization of the music universe and that the quantization becomes finer as the size of the dictionary (i.e., audio codebook) increases. In practice, however, the high computational cost of codeword assignment might discourage the use of a large dictionary. This paper presents two modifications of a LASSO screening technique developed in the compressive sensing field to speed up the codeword assignment process. The first modification exploits the repetitive nature of music signals, whereas the second one relaxes a screening constraint that is specific to reconstruction but not to classification. Our experiments show that the proposed method enables the use of a dictionary of 10,000 codewords with runtime close to the case of using a dictionary of 1,000 codewords. Moreover, using the larger dictionary significantly improves the mean average precision (MAP) from 0.219 to 0.246 for tagging thousands of tracks with 147 possible genre tags.
  • Keywords
    audio coding; compressed sensing; music; signal classification; LASSO screening technique; audio codebook; audio codeword; audio word-based music classification; compressive sensing field; large scale dictionary; modified lasso screening; music signal; music universe quantization; Accuracy; Dictionaries; Encoding; Multiple signal classification; Music; Support vector machines; Tagging; LASSO screening; Sparse coding; feature learning; genre classification; music information retrieval
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Acoustics, Speech and Signal Processing (ICASSP), 2014 IEEE International Conference on
  • Conference_Location
    Florence
  • Type
    conf
  • DOI
    10.1109/ICASSP.2014.6854596
  • Filename
    6854596