Robust speech features based on wavelet transform with application to speaker identification

Author

Hsieh, C.-T. ; Lai, E. ; Wang, Y.-C.

Author_Institution

Dept. of Electr. Eng., Tamkang Univ., Taipei, Taiwan

Volume

149

Issue

2

fYear

2002

fDate

4/1/2002

Firstpage

108

Lastpage

114

Abstract

An effective and robust speech feature extraction method is presented. Based on the time-frequency multiresolution property of the wavelet transform, the input speech signal is decomposed into various frequency channels. For capturing the characteristics of an individual speaker, the linear predictive cepstral coefficients of the approximation channel and entropy value of the detail channel for each decomposition process are calculated. In addition, an adaptive thresholding technique for each lower resolution is also applied to remove the influence of noise interference. Experimental results show that using this mechanism not only effectively reduces the influence of noise interference but also improves the recognition performance. Finally, the proposed method is evaluated on the MAT telephone speech database for text-independent speaker identification using the group vector quantisation identifier. Some popular existing methods are also evaluated for comparison, and the results show that the proposed feature extraction algorithm is more effective and robust than the other existing methods. In addition, the performance of the proposed method is very satisfactory even in a low SNR environment corrupted by Gaussian white noise.
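
The feature layout described in the abstract (per decomposition level, cepstral coefficients from the approximation channel plus one entropy value from the detail channel) can be illustrated with a minimal sketch. It rests on several assumptions that are not stated in the abstract: a Haar wavelet, Shannon entropy of normalised coefficient energies, the autocorrelation method with Levinson-Durbin recursion for the linear prediction step, and illustrative values for `levels` and `order`. All function names are hypothetical, and the adaptive thresholding step is omitted entirely.

```python
import math


def haar_step(x):
    """One level of a Haar DWT (assumed wavelet): returns (approximation, detail)."""
    s = math.sqrt(2.0)
    n = len(x) - len(x) % 2  # ignore a trailing odd sample
    approx = [(x[i] + x[i + 1]) / s for i in range(0, n, 2)]
    detail = [(x[i] - x[i + 1]) / s for i in range(0, n, 2)]
    return approx, detail


def shannon_entropy(coeffs):
    """Shannon entropy (in nats) of the normalised coefficient energies (assumed definition)."""
    energy = [c * c for c in coeffs]
    total = sum(energy)
    if total == 0.0:
        return 0.0
    return -sum((e / total) * math.log(e / total) for e in energy if e > 0.0)


def lpc(x, order):
    """LPC coefficients a[1..order] for the predictor x[n] ~ sum_k a[k] * x[n-k],
    computed with the autocorrelation method and Levinson-Durbin recursion."""
    r = [sum(x[i] * x[i + k] for i in range(len(x) - k)) for k in range(order + 1)]
    a = [0.0] * (order + 1)
    err = r[0]
    for m in range(1, order + 1):
        if err <= 0.0:  # silent or degenerate frame: stop early
            break
        k = (r[m] - sum(a[j] * r[m - j] for j in range(1, m))) / err
        prev = a[:]
        a[m] = k
        for j in range(1, m):
            a[j] = prev[j] - k * prev[m - j]
        err *= 1.0 - k * k
    return a[1:]


def lpcc(a):
    """Cepstral coefficients c[1..p] from LPC coefficients via the standard recursion
    c[n] = a[n] + sum_{k=1}^{n-1} (k/n) * c[k] * a[n-k]."""
    p = len(a)
    c = [0.0] * p
    for n in range(1, p + 1):
        c[n - 1] = a[n - 1] + sum(
            (k / n) * c[k - 1] * a[n - k - 1] for k in range(1, n)
        )
    return c


def extract_features(frame, levels=3, order=4):
    """Per level: LPCC of the approximation channel plus entropy of the detail channel."""
    features = []
    approx = list(frame)
    for _ in range(levels):
        approx, detail = haar_step(approx)
        features.extend(lpcc(lpc(approx, order)))
        features.append(shannon_entropy(detail))
    return features


# Usage on a synthetic 256-sample frame (not real speech).
frame = [math.sin(0.3 * n) + 0.1 * math.sin(2.1 * n) for n in range(256)]
vector = extract_features(frame)
print(len(vector))
```

With these illustrative settings, each level contributes `order + 1` values, so the feature vector grows linearly with the number of decomposition levels. The paper's own wavelet, prediction order, and entropy definition may differ from the choices made here.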

Keywords

cepstral analysis; entropy; feature extraction; speaker recognition; time-frequency analysis; vector quantisation; wavelet transforms; Gaussian white noise; MAT telephone speech database; adaptive thresholding technique; approximation channel; detail channel; entropy value; frequency channels; group vector quantisation identifier; linear predictive cepstral coefficients; low SNR environment; noise interference; robust speech; speaker identification; text-independent identification; time-frequency multiresolution property; wavelet transform

fLanguage

English

Journal_Title

Vision, Image and Signal Processing, IEE Proceedings -

Publisher

IET

ISSN

1350-245X

Type

jour

DOI

10.1049/ip-vis:20020121

Filename

1018001