DocumentCode :
417116
Title :
Using Haar transformed vocal source information for automatic speaker recognition
Author :
Zheng, Nengheng ; Ching, P.C.
Author_Institution :
Dept. of Electron. Eng., Chinese Univ. of Hong Kong, Shatin, China
Volume :
1
fYear :
2004
fDate :
17-21 May 2004
Abstract :
This paper investigates the effectiveness of incorporating vocal source information to enhance automatic speaker recognition accuracy. We propose a new method to extract discriminative features from the linear prediction (LP) residual signal, which is closely related to the glottal excitation of each individual speaker. A parameter set complementary to the commonly used linear predictive cepstral coefficients (LPCC), called Haar octave coefficients of residue (HOCOR), is obtained by applying a Haar transform to the LP residue. This additional feature vector retains the spectro-temporal characteristics of the source excitation sequences that are related to the fundamental frequency, its harmonics, and their phases. Experimental evaluation on the YOHO corpus demonstrates the high speaker-discriminative power and high inter-speaker variability of HOCOR. Speaker recognition tests using both the vocal tract feature (LPCC) and vocal source information (HOCOR) outperform the conventional method of using LPCC only.
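As a rough illustration of the pipeline the abstract outlines (LP analysis, residual extraction, Haar transform of the residual), the Python sketch below may help. The abstract does not give the exact HOCOR definition, so everything specific here is an assumption made for illustration: the function names, the LP order of 12, the autocorrelation method with a small regularization term, the use of one norm per Haar detail band, and the choice of four bands. It should not be read as the authors' method.

```python
import numpy as np

def lp_residual(frame, order=12):
    """LP residual of one frame (autocorrelation method; order is an assumption)."""
    frame = np.asarray(frame, dtype=float)
    n = len(frame)
    # Autocorrelation at lags 0..order.
    r = np.correlate(frame, frame, mode="full")[n - 1 : n + order]
    # Toeplitz normal equations R a = r[1:], lightly regularized for stability.
    R = np.array([[r[abs(i - j)] for j in range(order)] for i in range(order)])
    R = R + 1e-9 * r[0] * np.eye(order)
    a = np.linalg.solve(R, r[1 : order + 1])
    # Predict each sample from the previous `order` samples (zeros before the frame).
    pred = np.zeros(n)
    for k in range(1, order + 1):
        pred[k:] += a[k - 1] * frame[:-k]
    return frame - pred

def haar_transform(x):
    """Orthonormal Haar DWT; len(x) must be a power of two.

    Returns (approximation, details) with details[0] the finest band.
    """
    x = np.asarray(x, dtype=float)
    details = []
    while len(x) > 1:
        even, odd = x[0::2], x[1::2]
        details.append((even - odd) / np.sqrt(2))
        x = (even + odd) / np.sqrt(2)
    return x, details

def hocor_like_features(residual, n_bands=4):
    """Hypothetical HOCOR-style vector: one norm per Haar detail band (assumption)."""
    _, details = haar_transform(residual)
    return np.array([np.linalg.norm(d) for d in details[:n_bands]])

# Usage on a synthetic 256-sample frame (a noisy sinusoid standing in for speech).
rng = np.random.default_rng(0)
frame = np.sin(2 * np.pi * np.arange(256) / 32) + 0.1 * rng.standard_normal(256)
features = hocor_like_features(lp_residual(frame))
```

In a combined system of the kind the abstract describes, a vector like `features` would be used alongside the LPCC vector computed from the same frame; how the two are fused is not stated in the abstract and is left open here.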
Keywords :
Haar transforms; cepstral analysis; feature extraction; speaker recognition; time-frequency analysis; HOCOR; Haar octave coefficients of residue; Haar transformed vocal source information; LPCC; automatic speaker recognition; discriminative feature extraction; individual speaker glottal excitation; inter-speaker variability; linear prediction residual signal; linear predictive cepstral coefficients; residue time-frequency analysis; source excitation sequence spectro-temporal characteristics; speaker discriminative power; vocal tract features; Automatic speech recognition; Cepstral analysis; Data mining; Feature extraction; Fourier transforms; Mel frequency cepstral coefficient; Partial response channels; Speaker recognition; Testing; Vectors;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Acoustics, Speech, and Signal Processing, 2004. Proceedings. (ICASSP '04). IEEE International Conference on
ISSN :
1520-6149
Print_ISBN :
0-7803-8484-9
Type :
conf
DOI :
10.1109/ICASSP.2004.1325926
Filename :
1325926