Title :
A Nonparametric Bayesian Multipitch Analyzer Based on Infinite Latent Harmonic Allocation
Author :
Yoshii, Kazuyoshi ; Goto, Masataka
Author_Institution :
Nat. Inst. of Adv. Ind. Sci. & Technol. (AIST), Tsukuba, Japan
fDate :
3/1/2012 12:00:00 AM
Abstract :
The statistical multipitch analyzer described in this paper estimates multiple fundamental frequencies (F0s) in polyphonic music audio signals produced by pitched instruments. It is based on hierarchical nonparametric Bayesian models that can deal with uncertainty of unknown random variables such as model complexities (e.g., the number of F0s and the number of harmonic partials), model parameters (e.g., the values of F0s and the relative weights of harmonic partials), and hyperparameters (i.e., prior knowledge on complexities and parameters). Using these models, we propose a statistical method called infinite latent harmonic allocation (iLHA). To avoid model-complexity control, we allow the observed spectra to contain an unbounded number of sound sources (F0s), each of which is allowed to contain an unbounded number of harmonic partials. More specifically, to model a set of time-sliced spectra, we formulated nested infinite Gaussian mixture models based on hierarchical and generalized Dirichlet processes. To avoid manual tuning of influential hyperparameters, we put noninformative hyperprior distributions on them in a hierarchical manner. For efficient Bayesian inference, we used a modern technique called collapsed variational Bayes. In comparative experiments using audio recordings of piano and guitar solo performances, iLHA yielded promising results and we found that there would be room for improvement based on modeling of temporal continuity and spectral smoothness.
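The nested mixture idea in the abstract (sound sources, each containing harmonic partials) can be illustrated with a small sketch. This is not the authors' implementation, and it rests on several assumptions that the abstract does not state: frequencies are expressed in cents on a log-frequency axis, so the m-th partial of an F0 lies 1200*log2(m) cents above it; all partials share one Gaussian width; the infinite mixtures are truncated to a finite number of components; and plain Dirichlet-process stick-breaking is swapped in for the generalized Dirichlet process. The names `stick_breaking`, `gaussian_pdf`, and `harmonic_mixture_density` are hypothetical.

```python
import math
import random


def stick_breaking(alpha, truncation, rng):
    """Truncated stick-breaking weights for a Dirichlet process.

    Assumption: a finite truncation stands in for the unbounded model.
    """
    weights = []
    remaining = 1.0
    for _ in range(truncation - 1):
        v = rng.betavariate(1.0, alpha)  # fraction broken off the remaining stick
        weights.append(remaining * v)
        remaining *= 1.0 - v
    weights.append(remaining)  # leftover mass, so the weights sum to 1
    return weights


def gaussian_pdf(x, mean, std):
    """Univariate Gaussian density."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def harmonic_mixture_density(x_cents, f0_cents, source_weights,
                             partial_weights, std_cents):
    """Density of a nested Gaussian mixture at log-frequency x_cents.

    Outer mixture: sound sources (one F0 each).
    Inner mixture: harmonic partials of that source.
    Assumption: partial m sits 1200 * log2(m) cents above the F0.
    """
    total = 0.0
    for pi_k, mu_k, taus in zip(source_weights, f0_cents, partial_weights):
        for m, tau_km in enumerate(taus, start=1):
            mean = mu_k + 1200.0 * math.log2(m)
            total += pi_k * tau_km * gaussian_pdf(x_cents, mean, std_cents)
    return total


if __name__ == "__main__":
    rng = random.Random(0)
    f0s = [6000.0, 6700.0]                        # two hypothetical F0s, in cents
    pis = stick_breaking(1.0, len(f0s), rng)      # source weights
    taus = [stick_breaking(1.0, 4, rng) for _ in f0s]  # 4 partials per source
    print(harmonic_mixture_density(6000.0, f0s, pis, taus, 20.0))
```

The sketch only evaluates the generative density for fixed parameters. Inference over the parameters, which the abstract says is done with collapsed variational Bayes, is not shown here.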
Keywords :
Bayes methods; Gaussian processes; audio signal processing; music; statistical analysis; Bayesian inference; collapsed variational Bayes; generalized Dirichlet process; hierarchical nonparametric Bayesian model; hyperparameters; infinite latent harmonic allocation; model complexities; model parameters; model-complexity control; multiple fundamental frequencies; nested infinite Gaussian mixture model; nonparametric Bayesian multipitch analyzer; pitched instruments; polyphonic music audio signals; random variables; sound sources; statistical multipitch analyzer; time-sliced spectra; Bayesian methods; Complexity theory; Harmonic analysis; Hidden Markov models; Probabilistic logic; Psychoacoustic models; Uncertainty; Bayesian nonparametrics; Dirichlet process; infinite latent harmonic allocation (iLHA); multipitch analysis;
Journal_Title :
Audio, Speech, and Language Processing, IEEE Transactions on
DOI :
10.1109/TASL.2011.2164530