Title :
Filterbank slope based features for speaker diarization
Author :
Madikeri, Srikanth ; Bourlard, Herve
Author_Institution :
Idiap Res. Inst., Martigny, Switzerland
Abstract :
In this paper, filterbank slope based features are applied to the Information Bottleneck based system for speaker diarization. Filterbank slope based features have shown promise in speaker recognition systems owing to their ability to emphasize formants. Hence, it is proposed to study their use in speaker diarization as well, where speaker discrimination is equally important. The feature is explored using two filterbank arrangements, linear and Mel, to form the Linear Filterbank Slope (LFS) and Mel Filterbank Slope (MFS) features, respectively. Both arrangements are shown to be inherently better at speaker discrimination than MFCC (Mel Frequency Cepstral Coefficients). The feature streams are tested on the NIST RT06, RT07 and RT09 datasets. Best-case relative improvements of 22.1% and 37.1% are observed for LFS and MFS, respectively, when compared with the MFCC-based baseline. The combination with time domain features is also studied and further improvements are observed. Finally, results on the fusion of multiple features are presented.
Keywords :
channel bank filters; speaker recognition; time-domain analysis; LFS; MFCC-based baseline; MFS; Mel filterbank slope; Mel frequency cepstral coefficients; NIST RT06, 07 and 09 datasets; filterbank slope based features; information bottleneck based system; linear filterbank slope; speaker diarization; speaker discrimination; speaker recognition systems; time domain features; Feature extraction; Filter banks; Hidden Markov models; Mel frequency cepstral coefficient; Speaker recognition; Speech; Filterbank slope; Information Bottleneck; Speaker Diarization;
Conference_Title :
Acoustics, Speech and Signal Processing (ICASSP), 2014 IEEE International Conference on
Conference_Location :
Florence
DOI :
10.1109/ICASSP.2014.6853568