Title :
Frequency offset correction in single sideband speech for speaker verification
Author :
Hua Xing ; Loizou, Philipos C. ; Hansen, John H. L.
Author_Institution :
Dept. of Electr. Eng., Univ. of Texas at Dallas, Richardson, TX, USA
Abstract :
Communication system mismatch is a major cause of performance loss in speaker recognition. While microphone and handset differences have been considered in the NIST SRE, nonlinear communication system differences, such as modulation/demodulation (Mod/DeMod) carrier drift, have yet to be considered. In this study, an algorithm for estimating and correcting Mod/DeMod frequency offset distortion in single sideband (SSB) modulated speech is formulated in two processing steps. In the first step, the offset is coarsely narrowed down to a small frequency interval, which eliminates the ambiguity caused by the periodicity of the spectrum. The second step performs fine-tuning within the pre-determined interval. For the first time, a statistical framework is developed for unique interval detection, in which an innovative acoustic feature is proposed to represent different offsets and two state-of-the-art techniques, the total variability method and PLDA, are applied. Speaker recognition experiments on SSB speech from the DARPA RATS corpus show that the proposed estimation and compensation method yields a significant improvement in speaker verification performance on SSB speech (up to 50% relative improvement in EER).
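To make the distortion concrete, the sketch below shows how a Mod/DeMod carrier mismatch shifts every spectral component of an SSB signal by the same number of hertz, and how a known offset can be undone by shifting in the opposite direction. It is not the estimation algorithm of the paper: the offset is assumed to be given, and the function names (`apply_offset`, `dominant_frequency`) and the test tone are hypothetical choices made for illustration. A naive DFT in plain Python keeps the example dependency-free; in practice an FFT library would be used.

```python
import cmath
import math


def dft(x):
    """Naive O(n^2) discrete Fourier transform (illustration only)."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]


def idft(spectrum):
    """Naive inverse DFT matching dft() above."""
    n = len(spectrum)
    return [sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n)
                for k in range(n)) / n
            for t in range(n)]


def analytic_signal(x):
    """Analytic signal of a real sequence of even length (negative
    frequencies zeroed, positive frequencies doubled)."""
    n = len(x)
    spectrum = dft(x)
    gain = [0.0] * n
    gain[0] = 1.0          # DC bin kept as-is
    gain[n // 2] = 1.0     # Nyquist bin kept as-is
    for k in range(1, n // 2):
        gain[k] = 2.0      # positive-frequency bins doubled
    return idft([spectrum[k] * gain[k] for k in range(n)])


def apply_offset(x, offset_hz, fs):
    """Shift all frequency components of x by offset_hz (assumed model of
    an SSB carrier mismatch). A negative offset undoes a positive one."""
    z = analytic_signal(x)
    return [(z[t] * cmath.exp(2j * math.pi * offset_hz * t / fs)).real
            for t in range(len(x))]


def dominant_frequency(x, fs):
    """Frequency (Hz) of the strongest positive-frequency DFT bin."""
    n = len(x)
    spectrum = dft(x)
    k = max(range(1, n // 2), key=lambda i: abs(spectrum[i]))
    return k * fs / n
```

For example, with `fs = 8000` and a 256-sample 1000 Hz cosine, `apply_offset(tone, 125.0, fs)` moves the spectral peak to 1125 Hz, and applying `-125.0` to the result moves it back to 1000 Hz. Because the shift is additive rather than multiplicative, harmonic relationships in voiced speech are broken by the offset, which is why it degrades spectral features used for speaker verification.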
Keywords :
demodulation; modulation; speaker recognition; statistical analysis; DARPA RATS corpus; Mod-DeMod carrier drift; Mod-DeMod frequency offset distortion; NIST SRE; PLDA; SSB speech; communication system mismatch; frequency offset correction; handset differences; innovative acoustic feature; microphone; modulation-demodulation carrier drift; nonlinear communication system differences; pre-determined interval; single sideband modulation speech; single sideband speech; small frequency interval; speaker recognition performance; speaker verification; spectrum periodicity; statistical framework; unique interval detection; Amplitude modulation; Estimation; Frequency estimation; Mel frequency cepstral coefficient; Speaker recognition; Speech; Training; MFCC; PLDA; SSB; frequency offset; i-Vector; speaker verification
Conference_Title :
Acoustics, Speech and Signal Processing (ICASSP), 2014 IEEE International Conference on
Conference_Location :
Florence
DOI :
10.1109/ICASSP.2014.6854357