Computational scene analysis of cocktail-party situations based on sequential Monte Carlo methods

Author

Nix, Johannes ; Kleinschmidt, Michael ; Hohmann, Volker

Author_Institution

Arbeitsgruppe Medizinische Phys., Oldenburg Univ., Germany

Volume

1

fYear

2003

fDate

9-12 Nov. 2003

Firstpage

735

Abstract

A frequent demand for noise suppression in digital hearing aids is speech enhancement in noisy multi-talker conditions. Whereas multi-microphone array-processing techniques employing a stationary or slowly varying directivity yield an improvement in intelligibility, binaural noise suppression algorithms using the two signals recorded at the left and right ears have not yet been shown to yield a significant benefit in complex acoustical environments. We therefore explore integrating principles of auditory scene analysis into speech enhancement algorithms. From psychoacoustics it is known that common onsets, common amplitude modulation, and sound source direction are among the important cues used for source separation by the human auditory system. However, it is largely unknown how the "binding" of different cues may work. This paper proposes a possible approach to the binding problem. A new algorithm is presented that performs statistical estimation of different sources using a state-space approach, integrating temporal and frequency-specific features of speech. It is based on a sequential Monte Carlo (SMC) scheme and tracks magnitude spectra and direction on a frame-by-frame basis using binaural signals. This is achieved by integrating empirically measured high-dimensional statistics of speech and directional information from head-related transfer functions. Results are shown for estimating the sound source direction of a moving voice and the spectral envelopes of two voices. The results indicate that the algorithm is able to localize two superimposed sound sources and separate their spectral envelopes on-line with adaptation times of about 50 ms, which is much faster than typical blind source separation algorithms.
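To make the frame-by-frame tracking idea concrete, the following is a minimal sketch of a generic bootstrap particle filter, one common form of sequential Monte Carlo. It is not the algorithm of the paper: it tracks only a single scalar direction, assumes random-walk dynamics, and uses a simple Gaussian likelihood in place of the measured speech statistics and head-related transfer functions mentioned in the abstract. All function names, parameters, and numeric values are hypothetical and chosen for illustration only.

```python
import math
import random


def simulate_moving_source(n_frames=100, start=-30.0, stop=30.0,
                           obs_std=5.0, seed=1):
    """Hypothetical test data: azimuth (degrees) moving linearly, plus noise."""
    rng = random.Random(seed)
    truth = [start + (stop - start) * t / (n_frames - 1)
             for t in range(n_frames)]
    noisy = [x + rng.gauss(0.0, obs_std) for x in truth]
    return truth, noisy


def bootstrap_particle_filter(observations, n_particles=500,
                              process_std=2.0, obs_std=5.0, seed=0):
    """Track a scalar state frame by frame with a bootstrap particle filter.

    Assumptions (not from the paper): random-walk state dynamics and a
    Gaussian observation likelihood.
    """
    rng = random.Random(seed)
    # Initialise particles uniformly over -90..90 degrees (assumed prior).
    particles = [rng.uniform(-90.0, 90.0) for _ in range(n_particles)]
    estimates = []
    for z in observations:
        # 1. Predict: propagate each particle through the random-walk model.
        particles = [p + rng.gauss(0.0, process_std) for p in particles]
        # 2. Weight: score each particle by the likelihood of the observation.
        weights = [math.exp(-0.5 * ((z - p) / obs_std) ** 2)
                   for p in particles]
        total = sum(weights)
        if total == 0.0:
            # All likelihoods underflowed; fall back to uniform weights.
            weights = [1.0 / n_particles] * n_particles
        else:
            weights = [w / total for w in weights]
        # 3. Estimate: posterior mean of the state for this frame.
        estimates.append(sum(w * p for w, p in zip(weights, particles)))
        # 4. Resample: draw particles in proportion to their weights.
        particles = rng.choices(particles, weights=weights, k=n_particles)
    return estimates


if __name__ == "__main__":
    truth, noisy = simulate_moving_source()
    estimates = bootstrap_particle_filter(noisy)
    for t in range(0, len(truth), 20):
        print(f"frame {t:3d}  true {truth[t]:7.2f}  estimate {estimates[t]:7.2f}")
```

The predict, weight, estimate, and resample steps run once per frame, which is what allows on-line operation. A tracker along the lines described in the abstract would need a much higher-dimensional state (magnitude spectra as well as direction) and a likelihood built from binaural features.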

Keywords

Monte Carlo methods; array signal processing; hearing aids; microphones; signal denoising; speech enhancement; 50 ms; auditory scene analysis; cocktail-party situations; digital hearing aids; multimicrophone array-processing techniques; noise suppression; sequential Monte Carlo methods; statistical estimation; Acoustic noise; Amplitude modulation; Ear; Frequency estimation; Image analysis; Psychoacoustics; Source separation; Working environment noise

fLanguage

English

Publisher

IEEE

Conference_Title

Signals, Systems and Computers, 2004. Conference Record of the Thirty-Seventh Asilomar Conference on

Print_ISBN

0-7803-8104-1

Type

conf

DOI

10.1109/ACSSC.2003.1292011

Filename

1292011