NMF With Time–Frequency Activations to Model Nonstationary Audio Events

Author

Hennequin, Romain ; Badeau, Roland ; David, Bertrand

Author_Institution

LTCI, Telecom ParisTech, Paris, France

Volume

19

Issue

4

fYear

2011

fDate

5/1/2011 12:00:00 AM

Firstpage

744

Lastpage

753

Abstract

Real-world sounds often exhibit time-varying spectral shapes, as observed in the spectrogram of a harpsichord tone or of a transition between two pronounced vowels. Whereas standard non-negative matrix factorization (NMF) assumes fixed spectral atoms, an extension is proposed in which the temporal activations (the coefficients of the decomposition on the spectral atom basis) become frequency dependent and follow a time-varying autoregressive moving average (ARMA) model. This extension can thus be interpreted in terms of a source/filter paradigm and is referred to as source/filter factorization. The factorization yields an efficient single-atom decomposition of a single audio event with strong spectral variation (but constant pitch). The new algorithm is tested on real audio data and shows promising results.
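
The model summarized in the abstract can be illustrated with a minimal Python sketch. It assumes, as one reading of the abstract, that each activation is the power response of an ARMA filter evaluated at the frequency of the bin, so that the modeled spectrogram is V_hat[f][t] = sum_r W[f][r] * h_rt(f). The function names `arma_gain` and `source_filter_model`, the parameter layout (`sigma2`, `b`, `a`), and the normalized-frequency convention are illustrative assumptions, not taken from the paper. The sketch covers only the forward model, not the estimation algorithm.

```python
import cmath
import math


def arma_gain(sigma2, b, a, nu):
    """Power response of an ARMA filter at normalized frequency nu (cycles/sample).

    sigma2: non-negative global gain (assumed parameterization).
    b, a: moving-average and autoregressive coefficient lists.
    """
    num = sum(bq * cmath.exp(-2j * math.pi * nu * q) for q, bq in enumerate(b))
    den = sum(ap * cmath.exp(-2j * math.pi * nu * p) for p, ap in enumerate(a))
    return sigma2 * abs(num) ** 2 / abs(den) ** 2


def source_filter_model(W, filters, freqs):
    """Return V_hat[f][t] = sum_r W[f][r] * h_rt(f).

    W: F x R nested list of non-negative atom values.
    filters: R x T nested list of (sigma2, b, a) tuples, one filter per atom and frame.
    freqs: list of F normalized frequencies, one per spectrogram bin.
    """
    F, R, T = len(W), len(filters), len(filters[0])
    return [
        [
            sum(W[f][r] * arma_gain(*filters[r][t], freqs[f]) for r in range(R))
            for t in range(T)
        ]
        for f in range(F)
    ]
```

With `b = [1.0]` and `a = [1.0]` the gain is flat in frequency, so the sketch reduces to the standard NMF product of atoms and scalar activations. A non-trivial filter such as `a = [1.0, -0.5]` lets a single atom change its spectral shape from frame to frame.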

Keywords

audio signal processing; matrix decomposition; spectrometers; time-frequency analysis; NMF; harpsichord tone; nonstationary audio event; single-atom decomposition; source-filter factorization; source-filter paradigm; spectral variation; spectrogram; standard nonnegative matrix factorization; temporal activation; time frequency activation; time varying spectral shape; time-varying autoregressive moving average modeling; Music information retrieval (MIR); non-negative matrix factorization (NMF); unsupervised machine learning;

fLanguage

English

Journal_Title

Audio, Speech, and Language Processing, IEEE Transactions on

Publisher

IEEE

ISSN

1558-7916

Type

jour

DOI

10.1109/TASL.2010.2062506

Filename

5535132