• DocumentCode
    66749
  • Title
    Sequential Complexity as a Descriptor for Musical Similarity
  • Author
    Foster, Peter; Mauch, Matthias; Dixon, Simon

  • Author_Institution
    Sch. of Electron. Eng. & Comput. Sci., Queen Mary Univ. of London, London, UK
  • Volume
    22
  • Issue
    12
  • fYear
    2014
  • fDate
    Dec. 2014
  • Firstpage
    1965
  • Lastpage
    1977
  • Abstract
    We propose string compressibility as a descriptor of temporal structure in audio, for the purpose of determining musical similarity. Our descriptors are based on computing track-wise compression rates of quantized audio features, using multiple temporal resolutions and quantization granularities. To verify that our descriptors capture musically relevant information, we incorporate them into similarity rating prediction and song year prediction tasks. We base our evaluation on a dataset of 15 500 track excerpts of Western popular music, for which we obtain 7 800 web-sourced pairwise similarity ratings. To assess the agreement among similarity ratings, we perform an evaluation under controlled conditions, obtaining a rank correlation of 0.33 between intersected sets of ratings. Combined with bag-of-features descriptors, we obtain performance gains of 31.1% and 10.9% for similarity rating prediction and song year prediction, respectively. For both tasks, analysis of selected descriptors reveals that representing features at multiple time scales benefits prediction accuracy.
  • Keywords
    audio coding; data compression; music; quantisation (signal); Web-sourced pairwise similarity ratings; Western popular music; audio temporal structure; bag-of-feature descriptors; multiple temporal resolutions; multiple time scale benefit prediction accuracy; musical similarity descriptor; quantization granularities; quantized audio features; rank correlation; sequential complexity; similarity rating prediction; song year prediction tasks; string compressibility; track-wise compression rates; Complexity theory; Feature extraction; IEEE transactions; Quantization (signal); Speech; Speech processing; Vectors; Music content analysis; musical similarity measures; time series complexity;
  • fLanguage
    English
  • Journal_Title
    Audio, Speech, and Language Processing, IEEE/ACM Transactions on
  • Publisher
    IEEE
  • ISSN
    2329-9290
  • Type
    jour
  • DOI
    10.1109/TASLP.2014.2357676
  • Filename
    6897943
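
The abstract describes descriptors built from compression rates of quantized audio features at multiple temporal resolutions and quantization granularities. The following is a minimal sketch of that idea, not the authors' implementation: the choice of zlib as the compressor, uniform scalar quantization, window averaging for downsampling, and all function and parameter names (`quantize`, `downsample`, `compression_rate`, `complexity_descriptor`, `factors`, `levels`) are assumptions made for illustration, and the input is a synthetic one-dimensional feature sequence rather than real audio features.

```python
import random
import zlib


def quantize(values, levels):
    """Map each value to an integer symbol in [0, levels) by uniform binning.

    Assumption: a scalar feature and uniform bins; levels must be <= 256
    so that each symbol fits in one byte.
    """
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0] * len(values)
    scale = levels / (hi - lo)
    return [min(levels - 1, int((v - lo) * scale)) for v in values]


def downsample(values, factor):
    """Average non-overlapping windows of `factor` frames (coarser time scale)."""
    return [sum(values[i:i + factor]) / factor
            for i in range(0, len(values) - factor + 1, factor)]


def compression_rate(symbols):
    """Compressed size divided by raw size, storing one byte per symbol."""
    raw = bytes(symbols)
    return len(zlib.compress(raw, 9)) / len(raw)


def complexity_descriptor(values, factors=(1, 2, 4), levels=(4, 16)):
    """Compression rate for every (downsampling factor, quantization levels) pair."""
    return {(f, q): compression_rate(quantize(downsample(values, f), q))
            for f in factors for q in levels}


if __name__ == "__main__":
    rng = random.Random(0)
    periodic = [float(i % 8) for i in range(4096)]   # highly repetitive sequence
    noise = [rng.random() for _ in range(4096)]      # unstructured sequence
    for name, seq in (("periodic", periodic), ("noise", noise)):
        print(name, complexity_descriptor(seq))
```

In this sketch a repetitive sequence yields lower compression rates than an unstructured one at every resolution and granularity, which is the intuition behind using compressibility as a proxy for sequential complexity. The resulting dictionary could be flattened into a per-track feature vector.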