  • DocumentCode
    3518302
  • Title
    On the importance of modeling temporal information in music tag annotation
  • Author
    Reed, Jeremy; Lee, Chin-Hui
  • Author_Institution
    Sch. of Electr. & Comput. Eng., Georgia Inst. of Technol., Atlanta, GA
  • fYear
    2009
  • fDate
    19-24 April 2009
  • Firstpage
    1873
  • Lastpage
    1876
  • Abstract
    Music is an art form in which sounds are organized in time; however, current approaches for determining similarity and classification largely ignore temporal information. This paper presents an approach to automatic tagging which incorporates temporal aspects of music directly into the statistical models, unlike the typical bag-of-frames paradigm in traditional music information retrieval techniques. Vector quantization on song segments leads to a vocabulary of acoustic segment models. An unsupervised, iterative process that cycles between Viterbi decoding and Baum-Welch estimation builds transcripts of this vocabulary. Latent semantic analysis converts the song transcriptions into a vector for subsequent classification using a support vector machine for each tag. Experimental results demonstrate that the proposed approach performs better in 15 of the 18 tags. Further analysis demonstrates an ability to capture local timbral characteristics as well as sequential arrangements of acoustic segment models.
  • Keywords
    acoustic signal processing; hidden Markov models; information retrieval; iterative methods; music; signal classification; speech processing; statistical analysis; support vector machines; vector quantisation; Baum-Welch estimation; Viterbi decoding; acoustic segment model; bag-of-frames paradigm; iterative process; latent semantic analysis; local timbral characteristic; music information retrieval technique; music tag annotation; song segments; statistical model; support vector machine; vector quantization; Automatic speech recognition; Multiple signal classification; Music information retrieval; Support vector machine classification; Tagging; Technical Activities Guide -TAG; Vocabulary
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Acoustics, Speech and Signal Processing, 2009. ICASSP 2009. IEEE International Conference on
  • Conference_Location
    Taipei
  • ISSN
    1520-6149
  • Print_ISBN
    978-1-4244-2353-8
  • Electronic_ISBN
    1520-6149
  • Type
    conf
  • DOI
    10.1109/ICASSP.2009.4959973
  • Filename
    4959973
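
The abstract contrasts order-aware modeling with the bag-of-frames paradigm. The following is a minimal sketch of that contrast, not the authors' implementation: it assumes songs have already been transcribed into sequences of acoustic segment labels, and it builds the kind of unigram-plus-bigram count matrix that latent semantic analysis could then factor. The labels `s1` and `s2`, the two toy transcripts, and both function names are hypothetical.

```python
from collections import Counter


def ngram_counts(transcript, n):
    """Count n-grams (as tuples) in a sequence of segment labels."""
    return Counter(
        tuple(transcript[i:i + n]) for i in range(len(transcript) - n + 1)
    )


def term_document_matrix(transcripts, n_values=(1, 2)):
    """Return (sorted term list, one row of counts per transcript)."""
    per_doc = []
    for transcript in transcripts:
        counts = Counter()
        for n in n_values:
            counts.update(ngram_counts(transcript, n))
        per_doc.append(counts)
    # The vocabulary is every n-gram seen in any transcript.
    terms = sorted(set().union(*per_doc))
    rows = [[counts[term] for term in terms] for counts in per_doc]
    return terms, rows


# Hypothetical transcripts: the same labels in a different order.
song_a = ["s1", "s2", "s1", "s2"]
song_b = ["s1", "s1", "s2", "s2"]

terms, rows = term_document_matrix([song_a, song_b])
for term, count_a, count_b in zip(terms, rows[0], rows[1]):
    print(" ".join(term), count_a, count_b)
```

The two toy songs have identical unigram counts, so an order-free representation cannot tell them apart, while their bigram counts differ. In a fuller pipeline the rows would be weighted and reduced (for example with a truncated SVD) before training one classifier per tag; those steps are omitted here.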