Title :
Speaker identification with distant microphone speech
Author :
Jin, Qin ; Li, Runxin ; Yang, Qian ; Laskowski, Kornel ; Schultz, Tanja
Author_Institution :
Language Technol. Inst., Carnegie Mellon Univ., Pittsburgh, PA, USA
Abstract :
The field of speaker identification has recently seen significant advancement, but improvements have tended to be benchmarked on near-field speech, ignoring the more realistic setting of far-field-instrumented speakers. In this work we present several findings on far-field speech from the MIXER5 Corpus, in the areas of feature extraction, speaker modeling, and multichannel score combination. First, we observe that minimum-variance distortionless response (MVDR) features outperform Mel-frequency cepstral coefficient (MFCC) features, and that fundamental frequency variation (FFV) features offer complementary information to both MFCC and MVDR features. Second, we present evidence that factor analysis significantly improves system performance, compared to the more traditional GMM/UBM strategy. Third, we find that frame-based score competition significantly improves performance under mismatched conditions with multiple channels available.
Keywords :
cepstral analysis; feature extraction; speaker recognition; MIXER5 Corpus; fundamental frequency variation; mel-frequency cepstral coefficient; microphone speech; minimum-variance distortionless response; multichannel score combination; speaker identification; speaker modeling; Acoustic distortion; Cepstral analysis; Filter bank; Loudspeakers; Mel frequency cepstral coefficient; Microphones; Natural languages; Speaker recognition; Speech analysis; Speech recognition; Distant Speech; Factor Analysis; Far-field Speech; Front-end Features; Speaker Identification;
Conference_Title :
Acoustics Speech and Signal Processing (ICASSP), 2010 IEEE International Conference on
Conference_Location :
Dallas, TX
Print_ISBN :
978-1-4244-4295-9
Electronic_ISBN :
1520-6149
DOI :
10.1109/ICASSP.2010.5495590