DocumentCode
310540
Title
Spectral subtraction and RASTA-filtering in text-dependent HMM-based speaker verification
Author
Hardt, Detlef ; Fellbaum, Klaus
Author_Institution
Inst. of Telecommun. & Theor. Electr. Eng., Tech. Univ. Berlin, Germany
Volume
2
fYear
1997
fDate
21-24 Apr 1997
Firstpage
867
Abstract
In real text-dependent, telephone-based speaker verification systems, both additive and convolutional noise influence the error rate considerably. In this paper, different procedures that make a speaker verification system more robust against noise are compared. We use either spectral subtraction in addition to MFCC feature extraction, or PLP and RASTA-PLP alone (without spectral subtraction). For spectral subtraction, two variants were examined: one placed in front of the system as a separate preprocessing stage, and a second one integrated into the MFCC computation. The first variant has the advantage that its window length can be chosen independently of that of the MFCC procedure, and this led to better results. However, the most effective procedure for telephone speech data is J-RASTA-PLP, although estimating the optimal J factor is difficult. At first we used a fixed J factor based on off-line measurement of the noise power. Finally, we performed some experiments to optimize the system by adaptively estimating the J factor during the utterance. This procedure is based on the method of spectral mapping, which has been shown to be very effective in automatic speech recognition.
Keywords
acoustic filters; acoustic noise; adaptive estimation; error statistics; feature extraction; hidden Markov models; optimisation; speaker recognition; spectral analysis; telephony; J-RASTA-PLP; MFCC-feature extraction; PLP; RASTA-PLP; RASTA-filtering; adaptive estimation; additive noise; automatic speech recognition; convolutional noise; error rate; noise power; optimal J factor; real text-dependent telephone-based speaker verification systems; spectral mapping; spectral subtraction; telephone speech data; text-dependent HMM-based speaker verification; utterance; window length; Additive noise; Automatic speech recognition; Error analysis; Feature extraction; Hidden Markov models; Mel frequency cepstral coefficient; Noise robustness; Spatial databases; Speech enhancement; Telephony;
fLanguage
English
Publisher
IEEE
Conference_Titel
Acoustics, Speech, and Signal Processing, 1997. ICASSP-97., 1997 IEEE International Conference on
Conference_Location
Munich
ISSN
1520-6149
Print_ISBN
0-8186-7919-0
Type
conf
DOI
10.1109/ICASSP.1997.596073
Filename
596073
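Illustration
The abstract names two noise-robustness techniques, spectral subtraction and (J-)RASTA filtering. The following is a minimal Python sketch of the general ideas only, not the authors' implementation. Everything in it is an assumption for illustration: the function names, the noise estimate taken from the first few frames, the over-subtraction factor alpha, the spectral floor, and the filter coefficients. The coefficients follow the commonly cited RASTA band-pass filter with a pole at 0.98 (some implementations use 0.94); they are not taken from this record. Input is assumed to be a list of per-frame power spectra that have already been computed.

```python
import math


def spectral_subtract(power_frames, noise_frames=5, alpha=1.0, floor=0.01):
    """Power spectral subtraction on a list of per-frame power spectra.

    Assumption: the first `noise_frames` frames contain no speech, so their
    mean serves as the additive-noise estimate.
    """
    n_bins = len(power_frames[0])
    lead = power_frames[:noise_frames]
    noise = [sum(f[k] for f in lead) / len(lead) for k in range(n_bins)]
    cleaned = []
    for frame in power_frames:
        cleaned.append([
            # Subtract the scaled noise estimate, but keep at least a small
            # fraction of the noisy power so values never go negative.
            max(frame[k] - alpha * noise[k], floor * frame[k])
            for k in range(n_bins)
        ])
    return cleaned


def j_compress(power_frames, j):
    """J-RASTA style compression ln(1 + J * x) of every spectral value."""
    return [[math.log(1.0 + j * x) for x in frame] for frame in power_frames]


def rasta_filter(trajectory, pole=0.98):
    """Band-pass filter one time trajectory (one band across all frames).

    Causal form of H(z) = 0.1 * (2 + z^-1 - z^-3 - 2 z^-4) / (1 - pole * z^-1),
    starting from a zero initial state.
    """
    num = (0.2, 0.1, 0.0, -0.1, -0.2)
    out = []
    prev = 0.0
    for t in range(len(trajectory)):
        acc = sum(c * trajectory[t - i] for i, c in enumerate(num) if t - i >= 0)
        prev = acc + pole * prev
        out.append(prev)
    return out
```

In this sketch, spectral_subtract would run before feature extraction, while j_compress followed by rasta_filter on each band trajectory corresponds to the J-RASTA path. A small J makes the compression nearly linear and a large J makes it nearly logarithmic, which is why the choice of J depends on the noise level. Because the filter numerator sums to zero, a constant offset in the compressed domain (such as a fixed channel effect) decays toward zero at the output.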