Noise robust estimate of speech dynamics for speaker recognition

Author

Openshaw, J.P. ; Mason, J.S.

Author_Institution

Dept. of Electr. Eng., Univ. Coll. of Swansea, UK

Volume

2

fYear

1996

fDate

3-6 Oct 1996

Firstpage

925

Abstract

The paper investigates the robustness of cepstral-based features with respect to additive noise, and details two methods of increasing that robustness with minimal need for a priori knowledge of the noise statistics. The first approach is a form of noise masking which adds a fixed offset to the linear spectral estimate. The second is a form of sub-band filtering, again in the linear domain, which estimates the dynamic content of the speech using Fourier transforms. This avoids the negative values normally inherent in such filtering, which present difficulties in deriving log estimates. Both methods are shown to provide useful levels of robustness to additive noise: for example, speaker identification error rates in SNR mismatched conditions of 15 dB are reduced from 60.5% for standard mel cepstra to 13.3% and 24.1% for the two approaches respectively.
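
The two ideas in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract gives no parameter values, so the function names, the offset value, the window length and the choice of Fourier bin are all assumptions made for illustration. The first function adds a fixed offset to linear filterbank energies before taking the log. The last one takes the magnitude of a non-DC Fourier coefficient of each sub-band's time trajectory, which is non-negative by construction.

```python
import numpy as np

def masked_log_spectrum(linear_energies, offset):
    """Noise-masking sketch: add a fixed offset to linear spectral
    energies before the log. `offset` is a hypothetical tuning value."""
    linear_energies = np.asarray(linear_energies, dtype=float)
    return np.log(linear_energies + offset)

def dct_ii(x, num_coeffs):
    """Unnormalised DCT-II along the last axis (log spectrum -> cepstra)."""
    x = np.asarray(x, dtype=float)
    n = x.shape[-1]
    k = np.arange(num_coeffs)[:, None]
    m = np.arange(n)[None, :]
    basis = np.cos(np.pi * k * (2 * m + 1) / (2 * n))
    return x @ basis.T

def subband_dynamics(linear_energies, window=8, bin_index=1):
    """Sub-band dynamics sketch: for each sliding window of frames, take
    the magnitude of one non-DC Fourier bin of every band's linear-energy
    trajectory. `window` and `bin_index` are assumed values."""
    e = np.asarray(linear_energies, dtype=float)      # shape (frames, bands)
    out = []
    for start in range(e.shape[0] - window + 1):
        segment = e[start:start + window]             # (window, bands)
        spectrum = np.fft.rfft(segment, axis=0)       # transform over time
        out.append(np.abs(spectrum[bin_index]))       # magnitude is >= 0
    return np.array(out)

# Example with synthetic (not measured) filterbank energies.
rng = np.random.default_rng(0)
energies = rng.uniform(0.1, 10.0, size=(20, 6))
cepstra = dct_ii(masked_log_spectrum(energies, offset=5.0), num_coeffs=4)
dynamics = subband_dynamics(energies, window=8, bin_index=1)
```

In this sketch a constant term added to a band within a window only changes the DC bin of the transform, so the non-DC magnitudes are unaffected by it. The offset in the first function compresses the range of the log spectrum, since low-energy regions can no longer fall far below `log(offset)`.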

Keywords

Fourier transforms; cepstral analysis; filters; noise; speaker recognition; additive noise; cepstral based feature robustness; dynamic speech content estimation; fixed offset; linear spectral estimate; log estimates; mis-matched conditions; noise masking; noise robust estimate; speaker identification error rates; speech dynamics; standard mel cepstra; sub-band filtering; Error analysis; Filtering; Noise robustness; Nonlinear filters; Speech enhancement; Statistics

fLanguage

English

Publisher

IEEE

Conference_Title

Spoken Language, 1996. ICSLP 96. Proceedings., Fourth International Conference on

Conference_Location

Philadelphia, PA

Print_ISBN

0-7803-3555-4

Type

conf

DOI

10.1109/ICSLP.1996.607753

Filename

607753