• DocumentCode
    3144780
  • Title

    Bayesian nonparametric music parser

  • Author

    Nakano, Masahiro ; Ohishi, Yasunori ; Kameoka, Hirokazu ; Mukai, Ryo ; Kashino, Kunio

  • Author_Institution
    NTT Commun. Sci. Labs., NTT Corp., Atsugi, Japan
  • fYear
    2012
  • fDate
    25-30 March 2012
  • Firstpage
    461
  • Lastpage
    464
  • Abstract
    This paper proposes a novel representation of music that can be used for similarity-based music information retrieval, and also presents a method that converts an input polyphonic audio signal to the proposed representation. The representation involves a 2-dimensional tree structure, where each node encodes a musical note and the two dimensions correspond to time and to simultaneously sounding notes, respectively. Since the temporal structure and the synchrony of simultaneous events are both essential in music, our representation reflects them explicitly. In conventional approaches to music representation from audio, note extraction is usually performed prior to structure analysis, but accurate note extraction has been a difficult task. In the proposed method, note extraction and structure estimation are performed simultaneously, and thus the optimal solution is obtained with a unified inference procedure. Specifically, we propose an extended 2-dimensional infinite probabilistic context-free grammar and a sparse factor model for spectrogram analysis. An efficient inference algorithm, based on Markov chain Monte Carlo sampling and dynamic programming, is presented. The experimental results show the effectiveness of the proposed approach.
  • Keywords
    Bayes methods; Markov processes; Monte Carlo methods; audio signal processing; dynamic programming; electronic music; information retrieval; 2-dimensional infinite probabilistic context-free grammar; 2-dimensional tree structure; Bayesian nonparametric music parser; Markov chain Monte Carlo sampling; inference algorithm; music representation; musical note encoding; note extraction; polyphonic audio signal; similarity-based music information retrieval; sparse factor; spectrogram analysis; structure estimation; temporal structure; unified inference procedure; Abstracts; Bayesian methods; Biological system modeling; Continuous wavelet transforms; Indexes; Markov chain Monte Carlo (MCMC); hierarchical Dirichlet process (HDP); infinite probabilistic context-free grammar (infinite PCFG); nonnegative matrix factorization (NMF)
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Acoustics, Speech and Signal Processing (ICASSP), 2012 IEEE International Conference on
  • Conference_Location
    Kyoto
  • ISSN
    1520-6149
  • Print_ISBN
    978-1-4673-0045-2
  • Electronic_ISBN
    1520-6149
  • Type

    conf

  • DOI
    10.1109/ICASSP.2012.6287916
  • Filename
    6287916