• DocumentCode
    310540
  • Title
    Spectral subtraction and RASTA-filtering in text-dependent HMM-based speaker verification
  • Author
    Hardt, Detlef ; Fellbaum, Klaus
  • Author_Institution
    Inst. of Telecommun. & Theor. Electr. Eng., Tech. Univ. Berlin, Germany
  • Volume
    2
  • fYear
    1997
  • fDate
    21-24 Apr 1997
  • Firstpage
    867
  • Abstract
    In real text-dependent, telephone-based speaker verification systems, both additive and convolutional noise considerably influence the error rate. In this paper, different procedures that make a speaker verification system more robust against noise are compared. We use either spectral subtraction in addition to MFCC feature extraction, or PLP and RASTA-PLP alone (without spectral subtraction). For spectral subtraction, two variants were examined: one applied as a preprocessing stage ahead of the system, and a second integrated into the MFCC computation. The first variant has the advantage that its window length can be chosen independently of that of the MFCC procedure, which led to better results. However, the most effective procedure for telephone speech data is J-RASTA-PLP, although estimating the optimal J factor is difficult. At first, we used a fixed J factor based on an off-line measurement of noise power. Finally, we performed some experiments to optimize the system by adaptively estimating the J factor during the utterance. This procedure is based on the method of spectral mapping, which has been shown to be very effective in automatic speech recognition.
  • Keywords
    acoustic filters; acoustic noise; adaptive estimation; error statistics; feature extraction; hidden Markov models; optimisation; speaker recognition; spectral analysis; telephony; J-RASTA-PLP; MFCC-feature extraction; PLP; RASTA-PLP; RASTA-filtering; adaptive estimation; additive noise; automatic speech recognition; convolutional noise; error rate; noise power; optimal J factor; real text-dependent telephone-based speaker verification systems; spectral mapping; spectral subtraction; telephone speech data; text-dependent HMM-based speaker verification; utterance; window length; Additive noise; Automatic speech recognition; Error analysis; Feature extraction; Hidden Markov models; Mel frequency cepstral coefficient; Noise robustness; Spatial databases; Speech enhancement; Telephony;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Acoustics, Speech, and Signal Processing, 1997. ICASSP-97., 1997 IEEE International Conference on
  • Conference_Location
    Munich
  • ISSN
    1520-6149
  • Print_ISBN
    0-8186-7919-0
  • Type
    conf
  • DOI
    10.1109/ICASSP.1997.596073
  • Filename
    596073
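
The abstract names spectral subtraction as one of the noise-robustness procedures but does not specify the variant used. As a rough illustration only, the following is a minimal sketch of generic power spectral subtraction with a spectral floor, assuming the noise power is estimated from a few leading non-speech frames. The function names and the parameters `alpha`, `beta`, and `n_leading` are hypothetical and are not taken from the paper.

```python
def estimate_noise(power_frames, n_leading=5):
    """Average the first n_leading frames per bin.

    Assumption: these leading frames contain noise only (no speech).
    """
    lead = power_frames[:n_leading]
    n_bins = len(lead[0])
    return [sum(frame[k] for frame in lead) / len(lead) for k in range(n_bins)]


def spectral_subtract(power_frames, noise_power, alpha=1.0, beta=0.01):
    """Generic power spectral subtraction with a spectral floor.

    power_frames: list of frames, each a list of per-bin power values.
    noise_power:  list of per-bin noise power estimates.
    alpha:        over-subtraction factor (hypothetical parameter).
    beta:         floor, as a fraction of the noisy power (hypothetical).
    """
    cleaned = []
    for frame in power_frames:
        if len(frame) != len(noise_power):
            raise ValueError("frame and noise estimate must have equal length")
        out = []
        for noisy, noise in zip(frame, noise_power):
            diff = noisy - alpha * noise
            floor = beta * noisy
            # Clamp to the floor so the result never becomes negative.
            out.append(diff if diff > floor else floor)
        cleaned.append(out)
    return cleaned


# Toy usage: two noise-only frames followed by one frame with extra energy.
frames = [[1.0, 2.0], [1.0, 2.0], [5.0, 2.0]]
noise = estimate_noise(frames, n_leading=2)
cleaned = spectral_subtract(frames, noise, alpha=1.0, beta=0.1)
```

In a pipeline like the one the abstract describes, such a step would operate on short-time power spectra, either ahead of feature extraction or inside the MFCC computation; the framing and windowing choices are left out of this sketch.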