Multiband audio modeling for single-channel acoustic source separation

Author

Reyes-Gomez, Manuel J. ; Ellis, Daniel P W ; Jojic, Nebojsa

Author_Institution

Dept. of Electr. Eng., Columbia Univ., New York, NY, USA

Volume

5

fYear

2004

fDate

17-21 May 2004

Abstract

Detailed hidden Markov models (HMMs) that capture the constraints implicit in a particular sound can be used to estimate obscured or corrupted portions from partial observations, the situation encountered when trying to identify multiple, overlapping sounds. However, when the complexity and variability of the sounds are high, as in a particular speaker's voice, a detailed model might require several thousand states to cover the full range of different short-term spectra with adequate resolution. To address the tractability problems of such large models, we break the source signals into multiple frequency bands and build separate but coupled HMMs for each band, requiring far fewer states per model. To prevent unnatural full-spectrum states and to enforce consistency within and between bands, the state in a particular band at any given frame is determined by the previous state in that band and the states in the adjacent bands. Coupling the bands in this manner results in a grid-like model for the full spectrum. Because exact inference in such a model is intractable, we derive an efficient approximation based on variational methods. In source separation of combined signals, models built with this approach outperform full-band models.
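
As an illustration of the coupling described in the abstract, the following Python sketch samples state sequences from a small grid of coupled chains. It is not the authors' implementation. The random transition tables, the choice to condition on the adjacent bands' states from the previous frame, the handling of the edge bands, and all function names are assumptions made for illustration. The variational inference step is not shown.

```python
import random


def make_transition_table(num_states, rng):
    """Build a random table P(next | prev_same, prev_lower, prev_upper).

    Hypothetical stand-in for a trained band model: each key is a triple of
    previous-frame states and each value is a normalized distribution.
    """
    table = {}
    for prev in range(num_states):
        for lower in range(num_states):
            for upper in range(num_states):
                weights = [rng.random() for _ in range(num_states)]
                total = sum(weights)
                table[(prev, lower, upper)] = [w / total for w in weights]
    return table


def step(states, tables, rng):
    """Advance every band by one frame.

    Assumption: each band looks at its own previous state and the previous
    states of the bands directly below and above it.
    """
    num_bands = len(states)
    new_states = []
    for b in range(num_bands):
        # Edge bands have a single neighbour; as a simplification, the band's
        # own state stands in for the missing one.
        lower = states[b - 1] if b > 0 else states[b]
        upper = states[b + 1] if b < num_bands - 1 else states[b]
        probs = tables[b][(states[b], lower, upper)]
        new_states.append(rng.choices(range(len(probs)), weights=probs)[0])
    return new_states


def sample_grid(num_bands, num_states, num_frames, seed=0):
    """Sample a frames-by-bands grid of state indices."""
    rng = random.Random(seed)
    tables = [make_transition_table(num_states, rng) for _ in range(num_bands)]
    states = [rng.randrange(num_states) for _ in range(num_bands)]
    path = [states]
    for _ in range(num_frames - 1):
        states = step(states, tables, rng)
        path.append(states)
    return path


path = sample_grid(num_bands=4, num_states=3, num_frames=5)
```

With 4 bands of 3 states each, every band stores a table over 3 × 3 × 3 conditioning triples, whereas a single full-band chain over the same joint configurations would need 3^4 = 81 states. This gap is the reason the abstract gives for splitting the spectrum into bands.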

Keywords

audio signal processing; hidden Markov models; source separation; variational techniques; combined signal source separation; coupled HMM; inter-band consistency; intra-band consistency; multiband audio modeling; multiband speech models; multiple frequency bands; multiple overlapping sounds; single-channel acoustic source separation; spectrum grid like model; variational approximation methods; Automatic speech recognition; Frequency synchronization; Graphical models; Signal resolution

fLanguage

English

Publisher

ieee

Conference_Titel

Acoustics, Speech, and Signal Processing, 2004. Proceedings. (ICASSP '04). IEEE International Conference on

ISSN

1520-6149

Print_ISBN

0-7803-8484-9

Type

conf

DOI

10.1109/ICASSP.2004.1327192

Filename

1327192